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ABSTRACT 


This  thesis  presents  a  parallel-processor-based  acoustic  ray  tracing 
algorithm  for  use  in  predicting  multipath  arrival  times  and  amplitudes,  for 
use  in  ocean  acoustic  tomography  experiments.  The  Runge-Kutta-Fehlberg 
numerical  integration  method  was  chosen,  out  of  three  other  methods,  to 
numerically  solve  the  ray  equations.  Cubic  splines  were  used  to  interpolate 
the  sound  speed  profile  and  bottom  bathymetry  data.  The  method  of 
Gaussian  beam  tracing  was  used  to  compute  the  multipath  amplitudes  at  a 
given  receiver  location.  The  ray  tracing  algorithm  is  written  in  C,  and  is 
designed  to  run  using  a  Macintosh  II-based  host  application  and  a 
transputer-based,  parallel  processing  workfarm. 
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I. 


INTRODUCTION 


A.  OCEAN  ACOUSTIC  TOMOGRAPHY 

“Ocean  acoustic  tomography  is  a  technique  for  observing  the  dynamic 
behaviour  of  ocean  processes  by  measuring  the  changes  in  travel  time  of 
acoustic  signals  transmitted  over  a  number  of  ocean  paths."  (Spindel,  1 986, 
pp.  7-13)  Ocean  acoustic  tomography  has  been  concerned  with  measuring  the 
mesoscale  fluctuations  and  features  of  the  ocean  which  are  characterized  bv 
dimensions  in  the  hundreds  of  kilometers  and  time  scales  on  the  order  of 
months. 

The  term  tomography  is  derived  from  the  Greek  word  “tomo"  which 
means  “cut"  or  “slice".  Ocean  acoustic  tomography  was  originally  proposed 
by  Walter  Munk  and  Carl  Wunsch  and  methods  for  inverting  observed  data 
to  obtain  sound  speed  fluctuations  were  presented  in  Munk  (1979).  The  speed 
at  which  sound  travels  through  the  ocean  is  a  function  of  many  factors 
including  temperature,  salinity  and  pressure.  In  addition,  travel  times  can  be 
affected  by  currents.  By  looking  at  a  “slice"  of  the  ocean  between  severai 
points  (moorings)  using  acoustic  transmissions,  the  travel  time  information 
obtained  is  used  to  estimate  these  fluctuations  in  ocean  variables.  This  is 
done  using  inverse  techniques  in  a  fashion  similar  to  that  done  in  Computer 
Assisted  Tomography  (CA7)  where  X-rays  are  used  to  yield  a  two- 
dimensional  view  of  the  interior  of  the  human  anatomy. 

Measurements  in  the  ocean  can  be  made  using  a  number  of  acoustic 
sources  (S)  and  receivers  ( R )  as  depicted  in  Figure  1.1.  Whereas  spot 
measurements  would  yield  S  +  R  pieces  of  information  and  would  be 
contaminated  by  small  scale  fluctuations,  the  tomography  approach  is 
multiplicative  ana  yields  SxR  pieces  of  information.  Also  measurements  of 
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travel  time  tend  to  be  spatially  integrating  which  effectively  smoothes  out  the 
small  scale  fluctuations.  (Munk,  1979,  p.  124) 


S, 


Figure  1.1  Example  Source  and  Receiver  Array 


Of  importance  in  ocean  acoustic  tomography  is  the  ability  to  estimate  the 
possible  acoustic  paths  (multipaths)  between  a  source  and  receiver  and  the 
information  such  as  travel  time  and  energy  associated  with  them.  One  such 
method  is  ray  tracing  which  uses  geometrical  principles  to  determine  the 
paths  taken  by  the  acoustic  energy.  Although  ray  tracing  is  based  on  ertain 
limiting  assumptions,  it  provides  an  intuitive  and  visual  means  of 
computing  the  acoustic  field  and  the  paths  followed  by  the  acoustic  wave 
fronts. 

Recently,  it  has  fc  an  suggested  that  tomography  could  monitor  the 
circulation  of  Monteiey  Bay,  California  (Miller  et  al.,  1989).  The  application  of 
traditional  ray  tracing  programs  has  been  hampered  by  their  inability  to 
model  propagation  through  the  extreme  bathymetry  of  the  Monterey  Bay 
submarine  canyon.  The  tradbional  programs  have  relied  on  the 
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determination  of  eigenrays  (rays  which  connect  source  and  receiver)  by 
shooting  methods.  Rays  are  traced  from  the  source  at  specified  angles  and 
travel  past  the  receiver  range.  Eigenrays  can  be  estimated  by  interpolating 
between  rays  which  bracket  the  receiver  depth.  This  is  a  very  computer¬ 
intensive  operation  for  typical  deep  ocean  tomography  scenarios.  The 
extreme  bathymetry  of  Monterey  Bay  makes  it  even  more  so.  This  thesis 
attempts  to  make  this  modelli.  7  problem  more  tractable  through  the  use  of 
Gaussian  beam  tracing  and  with  a  parallel-processor-based  workfarm  system 
running  in  a  Macintosh  II  desktop  computer. 

B.  THESIS  ORGANIZATION 

This  thesis  is  organized  into  eight  chapters.  Chapter  II  discusses  the  basic 
theory  of  geometrical  acoustics,  or  ray  tracing.  Some  basic  equations  are 
derived  and  common  methods  for  solving  these  equations  are  given.  A 
method  of  using  cubic  splines  to  interpolate  sound  speed  profile  data  is 
presented.  Chapter  III  presents  four  numerical  integration  techniques  that 
were  considered  for  this  thesis  to  solve  the  basic  ray  equation.  A  brief 
discussion  of  numerical  integration  and  each  of  the  methods  is  given.  The 
integration  methods  were  tested  using  a  simple  ray  tracing  problem  to  see 
which  method  was  most  suitable.  The  results  of  this  test  are  presented. 

Chapter  IV  gives  a  discussion  of  a  relatively  new  method  for  computing 
acoustic  fields  in  the  ocean  -  Gaussian  beam  tracing.  This  method  associates 
with  each  ray  path,  a  beam  with  a  Gaussian  distribution,  and  can  be  used  to 
scale  other  acoustic  quantities  s"ch  as  intensity  and  pressure.  The  Gaussian 
beam  method  is  also  free  of  certain  artifacts  which  are  present  in 
conventional  ray  tracing  techniques,  such  as  infinite  energy  at  caustics  and 
perfect  shadow  zones. 

Some  basic  concepts  about  parallel  processing  are  presented  in 
Chapter  V.  The  architecture  of  the  T800  transputer  and  a  discussion  about  the 
arallel  processor  workfarm  is  also  presented.  Chapter  VI  presents  some 
implementation  aspects  of  this  thesis.  Chapter  VII  presents  some  results 
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obtained  using  the  ray  tracing  and  gaussian  beam  algorithms  and  results  from 
the  parallel  processing  optimization.  Chapter  VIII  presents  conclusions  and 
recommendations  for  further  work  in  this  area. 
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II.  RAY  TRACING 


A.  RAY  THEORY 

The  linearized,  lossless  wave  equation  for  the  propagation  of  sound  in 
fluids  is  given  by 


c  at2' 


(2.1) 


where  c  is  the  phase  speed  of  acoustic  waves  in  the  fluid  and  p  is  the  acoustic 
pressure.  Both  c  and  p  are  functions  of  position  (i.e.,  c  =  c(x,y,z),  p  =  p(x,y,z)). 
It  is  often  useful  to  think  of  the  propagation  of  sound  in  terms  of  rays  instead 
of  plane  waves.  Rays  can  be  defined  as  lines  which  are  perpendicular 
everywhere  to  surfaces  of  constant  phase  (Kinsler,  1982,  p.  117).  In  many  cases 
rays  are  easier  to  work  with  than  waves;  however  rays  are  approximations 
and  are  only  valid  under  certain  conditions.  One  possible  solution  to  the 
wave  equation  which  leads  to  the  ray  approximation  is  given  by 


p(x,y,z,t)  =  A(x,y,z)  e 


iO)[t  -  r(x,y,z)/c0] 


(2.2) 


where  A  is  the  pressure  amplitude,  T  is  a  function  with  units  of  length  and  co 
is  a  constant  value  of  phase  speed.  Surfaces  of  constant  phase  are  defined  by 
values  of  (x,y,z)  such  that  T  is  constant.  The  quantity  VT  is  therefore 
perpendicular  to  these  surfaces  of  constant  phase.  Substituting  Equation  (2.2) 
into  the  wave  equation  yields 


V2A 

A 


f\2  vr.vr  + 


vr  +  vt  =  o. 


(2.3) 
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If  A  and  Vr  vary  slowly  such  that 


V2A 


«(g)/c)2,  IV2ri«a)/C/ 


VA 


vr 


«  w/c  ,  (2.4) 


then  Equation  (2.3)  simplifies  to  the  Eikonal  equation 


I  vr  1 2  =  n2, 


(2.5) 


where 


n  =  n(x,y,z)  = 


c(x,y,z) 


(2.6) 


is  the  index  of  refraction.  The  conditions  given  in  Equation  (2.4)  can  be  met  if 
the  amplitude  of  the  wave  and  the  speed  of  sound  do  "...not  change 
appreciably  in  distances  comparable  to  a  wavelength."  (Kinsler,  1982,  pp.  117- 
118) 


Solution  of  Equation  (2.5)  yields  the  ray  path  trajectories  that  the  acoustic 
energy  follows.  The  behaviour  of  VT  is  given  by 

^(VD  =  Vn,  (2.7) 

which,  if  the  sound  speed  is  a  function  of  depth  only  (i.e.,  c  =  c(z)),  leads  to  a 
form  of  Snell's  law 


cos  9 

=  constant,  (2.8) 

where  0  is  the  angle  of  the  ray  path  measured  from  the  horizontal  at  a 
particular  location  along  the  ray  path.  (Kinsler,  1982,  pp.  118-120) 
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B.  METHODS  FOR  SOLVING  THE  RAY  PATH 


1.  Ray  Equations 

By  the  repeated  application  of  Snell's  law  in  a  horizontally  stratified 
medium/  the  ray  path  can  be  determined  by 

COS  01  COS  02  COS  0n 

Cl  “  C2  "  cn  '  (29) 

where  the  subscripts  1,2, ...  n  denote  the  successive  layers  of  the  medium  (Clay, 
1977,  p.  83).  Again,  the  speed  of  sound  is  assumed  to  vary  only  with  depth. 
An  element  of  the  ray  path  at  a  point  P  is  shown  in  Figure  2.1.  The  quantities 
ds,  dz  and  dx  represent  the  differential  ray  path  length,  vertical  displacement 
and  horizontal  displacement  respectively. 


Figure  2.1  Element  of  Ray  Path  at  Point  P 


From  Figure  2.1  we  have 


ds2  =  dz2  +  dx2.  (2.10) 

Dividing  both  sides  by  dx  and  solving  for  the  differential  change  in  depth  (dz) 
with  respect  to  the  differential  change  in  range  (dx)  yields 
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fds'f  fdzY 

W  - (dxj  + 


Also  from  Figure  2.2  we  have  the  relation 

ds  1 
dx  "  cos  9  ' 

From  Snell's  law  we  have 

cos  9  cos  0 
o 


(2.11) 


(2.12) 


(2.13) 


(2.14) 


so  that 


cos  0  =  a  c(z). 


(2.15) 


where  a  is  the  Snell's  law  constant  of  the  ray  (Clay,  1977,  p.  84).  Substituting 
this  result  into  Equation  (2.12)  and  taking  the  positive  branch  of  the  square 
root  yields 


dz  _ 
dx  ~ 


V 


f  1  ¥ 

a  c(z) 


1, 


(2.16) 


the  differential  change  in  depth  with  respect  to  a  differential  change  in  range. 
The  differential  travel  time  along  the  ray  segment  is  given  by  (Clay,  1977, 
p.  84) 


dt  1 
ds  ~  c(z)  ' 

or 

_ _ dx _ 

dt  ~  c(z)  coc  0  ’ 


(2.17) 


(2.18) 
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These  relations  can  also  be  expressed  as  integrals  along  the  ray  from  an  initial 
range  Xj  to  a  final  range  Xf 


Zf  -  Zj 


xf 

r 


dx 


j 

Xi 


f  1  V 

^a  c(z)y 


-1 


(2.19) 


tf  -  ti  = 


Xf 

r 


dx 


c(z)  cos  0' 


(2.20) 


Although  the  speed  of  sound  is  a  continuous  function  of  depth,  a 
common  approximation  made  in  some  ray  tracing  implementations  is  to 
break  the  sound  speed  profile  into  linear  segments,  or  layers  as  shown  in 
Figure  2.2.  This  is  natural  since  the  sound  speed  is  usually  only  known  (i.e., 
measured)  at  discrete  depth  points. 


c  [m/s] 


Figure  2.2  Linear  Approximation  of  Sound  Speed  Profile 
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Note  that  the  thicker  curved  line  in  Figure  2.2  represents  the  actual  sound 
speed  profile;  the  thin  straight  lines  represent  the  linear  approximations  and 
the  dashed  lines  represent  the  layers. 

Once  this  linear  approximation  is  made  to  the  sound  speed  profile, 
the  ray  path  within  a  particular  layer  can  be  evaluated  analytically.  The  result 
is  that  the  ray  path  follows  the  arc  of  a  circle  whose  radius  is  given  by 


radius  = 


ag' 


(2.21) 


where  a  is  the  Snell'  s  law  constant  defined  in  Equation  (2.14)  and  g  is  the 
constant  sound  speed  gradient  (dc/dz)  within  the  layer  (Clay,  1977,  p.  87). 
Although  this  method  is  simplistic  and  suitable  for  hand  calculation,  it  can 
yield  unsatisfactory  results  due  to  discontinuities  in  dc/dz  at  the  interfaces 
(Pederson,  1961,  pp.  465-474). 


2.  Sound  Speed  Profile  Interpolation 

To  avoid  the  problems  associated  with  using  linearly  segmented 
sound  speed  profiles,  a  method  of  curve  fitting  the  sound  speed  data  was 
chosen.  The  cubic  spline  method  was  used  for  its  simplicity  and  satisfactory 
results  in  smoothly  interpolating  the  sound  speed  profile  data.  The  cubic 
spline  involves  approximating  a  curve  by  fitting  a  third-degree  polynomial 
between  each  pair  of  points.  The  polynomials  are  computed  so  that  they  pass 
through  the  given  data  points  and  are  twice  differentiable  (Moler,  1970, 
p.  740).  The  data  points  are  not  required  to  be  equally  spaced.  Because  the 
resultant  curve  is  an  interpolation  rather  than  a  smoothing  of  the  data  points, 
the  data  points  are  assumed  to  be  free  of  significant  measurement  errors. 

To  compute  the  spline  coefficients,  it  is  assumed  that  the  sound 
speed  data  is  given  as  a  set  of  n  points  such  that  q  =  c(zj),  i  =  0,1,.. .,n.  The 
spline  coefficients  aj,  i  =  l,2,...,n-l,  are  computed  once  for  a  given  data  set  by 
solving  the  n-1  simultaneous  linear  equations 
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Azj  a;.!  +  2(Azj  +  A zi+1)  a;  +  Azi+i  ai+1  =  Aci+1  -  Aq  ,  i  =  1,2,.. .,11-1, 


(2.22) 


where 


Azj  —  Zj  ■  Zj.-|, 


(2.23) 


Ac,  = 


Cj  -  Cm 
Azj 


,  i  =  1,2,.. ,,n. 


(2.24) 


and  a0  and  an  are  assumed  to  equal  zero.  The  n-1  linear  equations  given  in 
Equation  (2.22)  are  tridiagonal1  in  form  and  can  be  solved  easily  using  special 
tridiagonal  matrix  algorithms  (Gerald,  1985,  pp.  146-147).  Once  the  spline 
coefficients  are  known,  the  following  values  are  easily  computed  for  a  given 
value  of  z  in  the  subinterval  z\.\  <  z  <  Zj  (Moler,  1970,  p.  740) 


c(z)  =  wcj-i  +  wq  +  (AZj)2  [aj_i(w3  -  w)  +  a*(w3  -  w)],  (2.25) 
c’(z)  =  Aq  +  Az;  [  -dj-i(3  w2  - 1)  +  ajOw2  - 1)],  (2.26) 

c"(z)  =  6  (waj_i  +  wa* ),  (2.27) 


where 


and 


w  = 


Z  -  Zj.i 
AZi 


w  =  1  -  w, 


(2.28) 


(2.29) 


and  c(z),  c'(z)  and  c”(z)  represent  the  speed  of  sound  and  its  first  and  second 
derivatives  respectively. 


1 


Tridiagonal  matrices  have  non-zero  elements  only  on  the  diagonal  and  immediately 
adjacent  to  the  diagonal. 
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3.  Solution  of  the  Ray  Equations 

With  the  ability  to  estimate  the  sound  speed  at  all  depths  through 
the  use  of  cubic  splines,  the  problem  of  determining  the  ray  path  is  a  matter  of 
numerically  solving  Equations  (2.16)  and  (2.17)  or  equivalently.  Equations 
(2.19)  and  (2.20).  Numerical  integration,  at  a  most  basic  level,  involves 
stepping  the  solution  in  the  independent  variable  direction  (dx)  and  solving 
for  the  subsequent  change  in  the  dependent  variable  (dz  and  dt)  as  specified  by 
the  numerical  integration  method  being  used.  Equations  (2.16)  and  (2.17) 
could  easily  be  solved  using  an  existing  numerical  integration  package  (e.g., 
IMSL).  Since  the  target  processor  for  this  ray  tracing  algorithm  was  the 
transputer,  it  was  necessary  to  write  a  new  numerical  integration  algorithm. 
This  did  however  give  the  opportunity  to  write  an  application-specific 
algorithm  that  would  not  require  as  much  overhead  as  perhaps  a  general 
purpose  one.  Also,  it  gave  an  opportunity  to  investigate  different  numerical 
integration  algorithms  and  determine  which  was  most  suitable  to  the  ray 
tracing  problem.  A  comparison  of  four  numerical  integration  algorithms  is 
given  in  Chapter  III. 
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III.  NUMERICAL  INTEGRATION  METHODS  COMPARISON 


A  number  of  numerical  integration  algorithms  were  examined  and 
evaluated  to  see  which  would  be  best  suited  to  solving  Equation  (2.9).  They 
were  tested  on  the  basis  of  their  speed,  accuracy  and  ease  of  implementation. 
Each  method  was  evaluated  using  only  a  simple  scenario  -  a  bilinear  sound 
speed  profile,  a  fixed  source  depth  and  receiver  range.  This  simple  scenario 
meant  that  full  ray  tracing  algorithms,  able  to  handle  all  conditions  (i.e., 
surface  and  bottom  reflections)  were  not  required  for  each  method. 

A.  NUMERICAL  INTEGRATION  METHODS 

1.  General 

Four  numerical  integration  methods  were  evaluated,  consisting  of 
two  basic  types  -  single  step  (self-starting)  and  multistep  (predictor-corrector) 
types.  Single  step  methods  are  self-starting  since  they  only  use  information 
from  the  previous  step.  Therefore,  to  start  they  only  require  the  initial 
conditions  of  the  problem  and  they  are  able  to  use  different  step  sizes 
throughout  the  integration.  Multistep  methods  take  advantage  of  the 
information  computed  in  multiple  past  steps.  Therefore,  a  set  of  initial 
conditions  is  not  sufficient  to  start  them.  They  must  be  started  using  single 
step  methods  until  enough  past  values  are  available  so  they  may  continue  on 
their  own.  (Gerald,  1985,  p.  312) 

The  order  of  a  numerical  method  is  related  to  the  amount  of 
accumulated  (global)  error  Ej[h]  or  the  per-step  truncation  (local)  error  ej[h]  at 
a  particular  step  j,  where  h  represents  the  integration  step  size.  If  Ej[h]  is  0(hn) 
then  ej[h]  is  0(hn+1),  and  the  method  is  considered  to  be  n-th  order  (Maron, 
1982,  pp.  340-341).  The  four  methods  evaluated  are  all  fourth-order  methods. 
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To  solve  Equation  (2.16)  numerically,  the  solution  or  ray  path,  is 
started  at  the  source  depth  and  range  (z0/x0)  and  initial  take-off  angle  (e0).  The 
ray  path  is  then  stepped  in  range  (x)  and  an  estimate  of  the  new  depth  (z,+i)  is 
computed  by  numerical  integration.  This  process  is  repeated  until  the 
receiver  range  (xr)  has  been  reached.  Note  that,  in  the  following  discussions 
of  the  integration  methods,  the  function  /  refers  to  Equation  (2.16). 

2.  Runge-Kutta  with  Fixed  Step  Size 

The  fourth-order  Runge-Kutta  (RK)  method  (Gerald,  1985,  p.  308)  is 
a  single  step  method  which  requires  four  function  evaluations,  or  sample 
slopes,  per  iteration.  A  common  Runge-Kutta  formula,  as  apy  ied  to  the  ray 
tracing  problem,  is 


where 


Zj+1  —  Zj  +  6  (ki  +  2(k2  +  k3)  +  1<4  J  , 

(3.1) 

ki  =  /( Zj ,  xj), 

k2  =  /(Zj+  2hkl(  Xj  +  5  h), 

(3.1.1) 

(3.1.2) 

k3  =  /( Zj  +  \  hk2,  Xj  +  \  h),  and 

(3.1.3) 

k4  =  /( Zj  +  hk3,  Xj  +  h). 

(3.1.4) 

The  above  equations  for  ki,...,k4  represent  the  sample  slopes  for  various 
values  of  z  and  x,  that  is  /( z,x).  The  term  h  refers  to  the  step  size  or  dx,  the 
differential  change  in  range.  In  the  case  where  the  speed  of  sound  varies  only 
with  depth  (c  =  c(z)),  the  function  /  is  only  dependent  on  depth.  Therefore  the 
range  values  are  not  required  and  the  sample  slope  equations  simplify  to 


ki  =  /(zj ), 
k2  =  /(zj  +  5hk1)/ 

(3.1.5) 

(3.1.6) 

k3  =  /( Zj  +  \  hk2),  and 

(3.1.7) 

k4  =  /( Zj  +  hk3). 

(3.1.8) 
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These  sample  slopes  are  illustrated  in  Figure  ?.l.  The  z  coordinates  of  the 
sample  points  Pi,... ^4  correspond  to  z,,  Zj  +  \  hkj,  Zj  +  5  hk2  and  Zj  +  hk3 
respectively.  The  sample  slopes  ki,...,k4  are  the  values  of  /  at  the  sample 
points  Pi,...,P4. 


It  is  possible  to  check  the  accuracy  of  the  RK  solution  by  computing 
a  second  estimate  of  the  function  with  the  step  size  h  reduced  by  one-half. 
The  two  solutions  can  be  compared  and  if  a  significant  difference  exists,  the 
integration  can  be  continued  using  the  reduced  step  size.  This  approach 
however  requires  the  calculation  of  eight  more  function  evaluations  per  step 
and  represents  much  more  programming  effort  to  implement.  An 
alternative  to  this  approach  is  to  compute  more  than  four  sample  slopes  at 
each  step  and  use  this  extra  information  to  yield  a  second  estimate  of  the 
solution  Zj+i.  The  second  estimate  of  z  can  be  used  to  adjust  the  step  size  up 
or  down  depending  on  whether  the  quantity  dz/dx  varies  slowly  or  rapidly 
(i.e.,  at  turning  points)  with  range.  One  such  method  that  employs  this 
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technique  is  the  Runge-Kutta-Fehlberg  algorithm.  (Maron,  1982,  pp.  348  351) 
(Gerald,  1985,  pp.  309-310) 


3.  Runge-Kutta-Fehlberg 

The  Runge-Kutta-Fehlberg  (RKF)  method  is  a  single-step  method 
with  variable  step  size.  The  RKF  method  uses  six  function  evaluations  per 
step  and  is  capable  of  estimating  the  error  at  every  step  using  this  extra 
information.  By  comparing  the  error  estimate  to  a  fixed  error  tolerance 
value,  the  step  size  is  adjusted  up  or  down  accordingly  after  each  step. 
Although  it  requires  two  more  sample  slopes  than  the  simple  Runge-Kutta 
method,  the  RKF  method  is  more  accurate  and  can  be  more  efficient  because 
of  the  step  size  control.  This  is  illustrated  later  in  Section  B  -  "Test  Results." 

The  RKF  equations  (Maron,  1982,  pp.  350-351),  again  for  the  range 
independent  case  (c  =  c(z)),  are 


25_.  1408  2197  1 

:j+l  -  zj  +  216  kl  +  2565  *3  +  4104  M  -  5  k5/ 


(3.2) 


where 


„  _  .  1  ,  128  .  2097  ,  I  ,  2  , 

Error  Estimate  -  36q  kl  -  4275  k3 '  75240  ^4  +  50  ^5  +  55  ^6, 


k,  =  h/(z,). 


(3.2.1) 


(3.2.2) 


k2  = 

=  h/(zj  + 

4  ki). 

(3.2.3) 

k3: 

=  h/(zj  + 

32  ki  + 

32  k2). 

(3.2.4) 

=  h/(zj  + 

1932 

7200 

7296 

4* 

2197  kl 

-2197  k2  + 

2197 

k3), 

(3.2.5) 

k5  - 

=  h/(Zj  + 

439 

216  kl  - 

OI  3680 

8k2  +  TIT 

1 

k3  - 

845 

4104 

k4),  and 

(3.2.6) 

k6  = 

=  h/(  zi  - 

8  , 

27  kl  + 

Ob  3544 

Zk2  +  2565 

k3  + 

1859 

4104 

k,  -  55  ks). 

(3.2.7) 

The  integration  step  size  adjustment  is  performed  in  the  following 
manner 


After  computing  the  ki,...,k6  values,  the  error  estimate  (Equation 
3.2.1)  is  computed.  If  the  ratio  of  the  error  estimate  divided  by  the 
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step  si?.e  is  less  than  the  error  tolerance  value,  then  zJ+1  is  computed 
using  Equation  (3.2)  and  the  range  is  incremented  by  the  step  size  h. 

•  A  step-size  scale  factor  is  then  computed,  regardless  of  whether  Zj+j 

is  computed  or  not,  as  follows 

,  ,  L  .1/4 

_  _  .  /  tolerance  x  h  \ 

Scale  factor  =  0.84  terror  estimate/  '  (3  3) 

•  The  step  size  is  then  scaled  (h  =  Scale  factor  x  h)  and  the  k  values 
computed  for  the  next  iteration.  Two  fixed  values  representing 
maximum  and  minimum  scale  factors  are  also  used  to  control  how 
fast  the  step  size  can  vary  at  any  one  time. 

The  error  estimate.  Equation  (3.2.1),  is  approximately  equal  to  the  per-step 
truncation  error,  namelv  ej+i[h].  (Maron,  1982,  p.  351) 

4.  Adams-Bashforth-Moulton  with  Fixed  Step  Size 

The  Adams-Dashforth-Moulton  (ABM)  method  is  a  multistep  or 
predictor  corrector  method.  It  uses  four  previously  computed  values  to 
compute  a  new  one  and  is  therefore  not  capable  of  self-starting.  The  ABM 
equations  are  (Maron,  1982,  pp.  354-356): 

the  Adams-Bashforth  predictor 

Pj+i  =  Zj  +  £  {-9/1-3  +  37/j_2  -  59/j-!  +  55/,},  (3.4) 

the  Adams-Moulton  corrector 

Cj+i  =  Zj  +  5*  {/j-2  -  5/j-i  +  19/j  +  9/^i(p^i)} ,  (3.5) 

and  the  error  estimate 

_  19 

Oj+1  -  ‘  270  '  Pj+i'-  ^.6) 
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The  error  estimate  8j+i  is  a  measure  of  the  local  error  ej+i(h]  and  can  be  used  to 
determine  whether  the  corrector  value  Cj+1  is  accurate  enough.  If  not,  it  can  be 
recomputed  using  another  function  estimate,  namely  the  function  evaluated 
using  the  corrector  value  (i.e.,  /(cj+i».  Although  this  method  is  not  self¬ 
starting,  since  the  /j.3,...,/j  are  not  known,  it  only  requires  two  or  three 
function  evaluations  per  iteration.  A  common  method  of  starting  (or 
restarting)  is  to  use  the  fixed-step-size,  fourth-order  Runge-Kutta  method, 
slightly  modified  to  save  the  function  values  /j_3,...,/j  in  addition  to  the 
Zj_3,...,Zj  computed  values.  (Maron,  1982,  pp.  355-357) 

5.  Adams-Bashforth-Moulton  with  Variable  Step  Size 

This  is  the  same  method  described  above  with  the  exception  that 
the  step  size  h  can  be  adjusted  in  one  of  two  ways  based  on  the  error  estimate 
5)+i.  The  first  is  to  scale  the  step  size  in  the  same  fashion  as  outlined  in  the 
Runge-Kutta-Fehlberg  method.  However,  once  the  step  size  is  scaled  the 
integration  must  be  restarted.  The  second  approach  is  to  double  or  halve  h 
only.  If  the  error  is  unacceptable,  then  the  step  can  be  halved  by  using  the 
following  equations  (interpolating  quartics)  to  compute  bisecting  values 

/j-l/2  =  128  {"5/j-4  +  28/j_3  -  70/, 2  +  140/j-i  +  35/j  },  (3.7) 

/j-3/2  =  64  {3/j-4  -  16/j- 3  +  54/j_2  +  24/j.j  -  /j  }  -  (3.8) 

If  the  error  is  acceptable  then  the  step  size  can  be  doubled  by  using  every 
second  function  value.  These  step  adjustments  are  depicted  in  Figure  3.2.  For 
a  normal  step  the  new  value,  f\  ,  is  computed  using  the  four  previous  values 
/_3,...,/i-  When  the  error  is  considered  too  large,  the  new  value,  j\a  ,  is 
computed  using  /_ 3/2,  /_ \,  /_  1/2  and  /o  and  the  step  size  is  reduced  by  one-half. 
When  the  error  is  acceptable,  a  new  step,  /2  ,  is  computed  using  /_ 6,  /-4,  /_ 2 
and  /o  ,  thus  doubling  the  step  size.  Note  that  doubling  the  step  size  requires 
that  seven  previous  function  values  must  be  stored.  (Maron,  1982,  pp.  357- 
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359)  This  method  will  be  referred  to  as  ABMQ,  where  the  "Q"  represents  the 
use  of  the  interpolating  quartic  equations. 


Normal  Step  t 


Half  Step 


Double  Step 


w - ■ — w - » - w  ; : - w 

Li  f-  3  f-2  f- 1  /o  :  f\ 

k 

S  too  large  here 

f-2 

/„  1  / 

f-  3/2  f-l/2  f\t 2 

1 

8  small  enough  here 

. 1  / 

\U  is  \U 

U  'f-2  it  \i\  1  X 

i 

Figure  3.2  Adams-Bashforth-Moulton  Step  Size  Adjustment 


B.  ALGORITHM  TESTING  AND  RESULTS 

1.  Test  Scenario 

The  numerical  integration  methods  decribed  previously  were  tested 
using  a  simple  bilinear  sound  speed  profile  as  shown  in  Figure  3.3.  The  test 
sound  speed  profile  consists  of  one  gradient  of  -0.05  s-1  from  the  surface  down 
to  400  meters  and  another  of  +0.05  s'* 1  from  400  meters  to  800  meters.  A 
source  depth  of  200  meters  was  chosen  so  that  a  ray  leaving  the  source  at  0° 
would  be  channeled  between  200  and  600  meters  and  would  not  interact  with 
the  surface  or  the  bottom.  This  reliable  acoustic  path  (RAP)  simplified  the 
support  algorithms  required  to  conduct  the  tests.  The  ray  path  ic  also  depicted 
in  Figure  3.3. 
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C  (m/sj 


range [m] 

Figure  3.3  Sound  Speed  Profile  and  Ray  Path 


All  rays  were  traced  out  to  a  range  of  100  kilometers  which  was 
considered  adequate  to  reveal  round-off  error  accumulation  effects.  A 
function  to  compute  the  analytic  solution  of  this  ray  path  was  also  used  to 
compare  the  results  (i.e.,  the  depth)  at  any  given  range  value.  The  spline 
interpolation  method  was  not  used  in  this  case  so  that  an  analytic  solution 
would  be  available  for  comparison. 

To  determine  which  numerical  integration  method  was  the  most 
efficient  in  terms  of  speed  and  accuracy,  a  maximum  allowable  depth  error 
was  first  chosen.  Each  algorithm  was  then  run  several  times  to  determine 
which  parameters  (e.g.,  step  size,  error  tolerance)  were  required  to  satisfy  the 
given  allowable  error.  A  maximum  error  was  chosen  because  the  error 
tended  to  oscillate  with  respect  to  range  as  illustrated  in  Figure  3.4.  This 
graph  shows  the  absolute  value  of  the  error  versus  range  for  the  Runge  Kutta 
algorithm  with  a  fixed  step  size  of  200  m.  It  is  clear  from  this  graph  that  to 
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simply  choose  the  error  at  a  particular  range  (e.g.,  receiver  range  xr)  may  give 
misleading  results. 


15 
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Figure  3.4  Example  Graph  of  I  Error  I  vs.  Range  for  Runge  Kutta 
with  Step  Size  =  200m 


2.  Test  Results 

The  maximum  allowable  depth  error  was  chosen  to  be  ±5  m.  The 
parameters  required  by  each  method  are  given  in  Table  3.1  and  the 
normalized  timing  results  are  shown  in  Figure  3.5.  It  is  clear  from  Figure  3.5 
that  the  RKF  method  is  the  most  efficient,  with  an  eight-to-one  speed 
advantage  over  the  next  closest  method  (RK). 


21 


TABLE  3.1 


REQUIRED  INTEGRATION  PARAMETERS 


Method 

Step  Size  [m] 

Error  Tolerance 

Maximum  1  Error  1  [m] 

RK 

45 

- 

4.86 

RKF 

- 

10'5 

4.94 

ABM 

30 

- 

4.98 

ABMQ 

- 

10"8 

4.98 

Integration 

Method 

RKF 


RK 


ABMQ 


ABM 


01  234  56789  10  11  12 

Normalized  Time 


Figure  3.5  Normalized  Timing  Results  for  Maximum  Error  ±5m 


The  advantages  of  a  variable-step  method  versus  a  fixed-step 
method,  with  regard  to  the  ability  to  vary  the  step  size  and  thus  control  the 
accuracy,  is  illustrated  in  Figure  3.6.  This  figure  shows  the  range  steps  taken 
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by  the  RKF  and  RK  methods  out  to  a  range  of  5  km.  The  data  used  in  this 
graph  is  taken  from  the  test  results  described  previously;  therefore  the  RK 
step  size  is  45m  and  the  RKF  error  tolerance  is  10~3 * 5.  Evident  from  Figure  3.6  is 
the  ability  of  the  RKF  method  to  compute  the  correct  ray  path  but  with  a  far 
greater  efficiency. 


range  [m] 


Figure  3.6  Range  Steps  Used  by  RKF  and  RK  Methods 


3.  Algorithm  Implementation 

Before  the  integration  algorithms  were  tested  as  previously 
described,  they  were  implemented  and  tested  using  standard  examples  found 
in  the  numerical  methods  texts  (e.g.,  Maron,  1982,  pp.  348-356).  Once  verified, 

the  algorithms  were  modified  as  required  so  that  Equation  (2.9)  could  be 
solved  specifically.  Although  these  modifications  were  not  extensive,  it  is 
possible  that  errors  were  introduced  at  this  time.  This  may  be  the  case  with 
the  Adams-Bashforth-Moulton  methods.  The  results  of  the  ABM  methods 
indicate  that  either  they  are  not  well  suited  to  the  ray  tracing  problem  or  else 
they  were  not  implemented  properly.  In  either  case,  the  ABM  methods  were 
much  more  difficult  to  implement  than  the  RK  and  RKF  methods.  Had  the 
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ABM  or  ABMQ  results  been  much  closer  to  the  RKF  results,  the  RKF  method 
would  still  have  been  favored  for  its  much  greater  ease  of  implementation . 

C.  DISCUSSION 

From  the  results  presented,  the  choice  of  which  numerical  integration 
algorithm  to  use  for  the  ray  tracing  problem  was  quite  easy.  The  Runge- 
Kutta-Fehlberg  method  was  by  far  the  superior  method.  By  employing  two 
extra  equations,  compared  to  the  regular  RK  method,  valuable  error 
information  can  be  obtained  and  used  to  easily  adjust  the  integration  step 
size.  It  is  the  ability  to  adjust  the  step  size  that  leads  to  a  much  greater 
efficiency  while  maintaining  the  desired  accuracy.  The  RKF  algorithm  was 
very  similar  to  the  RK  with  respect  to  implementation.  It  could  therefore 
easily  replace  an  existing  RK  code  without  significant  modifications.  Both  the 
RK  and  RKF  methods  were  very  easy  to  code  when  compared  to  the  ABM 
and  ABMQ  methods. 

The  ease  of  implementation  is  an  important  issue  since  the  ray  tracing 
codes  used  in  this  test  only  provide  minimum  capabilities.  The  full  ray 
tracing  code  is  much  more  complicated.  For  example,  surface  and  bottom 
interactions  must  be  handled  and  more  than  one  equation  must  be  integrated 
at  the  same  time.  The  testing  was  meant  to  quickly  determine  the  most 
suitable  numerical  integration  algorithm  for  a  simple  test  problem. 
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IV.  GAUSSIAN  BEAM  TRACING 


A.  BACKGROUND 

1.  Intensity  Calculations 

A  common  method  of  estimating  the  intensity  along  a  particular 
ray  path  is  to  assume  that  intensity  changes  are  strictly  due  to  spreading  loss 
and  that  all  acoustic  energy  transmitted  between  two  adjacent  rays  remains 
between  those  rays.  The  change  in  intensity  is  therefore  assumed 
proportional  to  a  change  in  area.  The  geometry  used  in  this  approximation  is 
illustrated  in  Figure  4.1.  This  leads  to  an  equation  for  the  ratio  of  the 
intensities,  or  the  focusing  factor  / 

,  I(x)  X  cose0  1 

f  =  i„  =  iisr  i|rr ■  ,4,) 

ae0 

where  Iq  and  I(x)  represent  the  intensities  at  the  source  (z0,x0)  and  at  a  point 
(z,x)  along  the  ray  path  respectively.  (Brekhovskikh,  1982  p.  40) 

A  problem  occurs  with  this  approach  when  rays  cross,  as  they  do  at 
focusing  or  convergence  zones  (Clay,  1977,  p.  93).  As  rays  converge,  the  term 
(dx/380)  approaches  zero  and  the  focusing  factor  approaches  infinity.  The 
envelopes  of  the  focusing  points,  where  /  equals  infinity,  are  also  referred  to 
as  caustics.  In  reality,  the  intensity  increases  sharply  at  caustics  but 
nevertheless,  it  remains  a  finite  quantity.  In  order  to  calculate  the  focusing 
factor  in  this  case,  modifications  to  the  ray  theory  must  be  made. 
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A  formula  for  the  focusing  factor  on  or  near  a  caustic  is  derived  in 
Brekhovskikh  (1980,  Sect.  45).  The  result,  as  given  in  Brekhovskikh  (1982,  pp. 
41-42),  is 


,  COS0O  (ko  sin60)1/3 

f  =  2  .  7TT7 - X 


sin0 


a2x 


00o 


-2/3 


t>2(0. 


(4.2) 


Figure  4.1  Ray  Geometry  For  Intensity  Calculation 


where  k0  is  the  wave  number  (k0  =  o>/c0)  and  v(t)  is  the  Airy  function.  The 
argument  f  of  the  Airy  function  is 


t  =±2m 


-1/3 

(ko  sinOo)2'3  (x  -  Xo), 


(4.3) 


where  the  plus  sign  is  chosen  if  O2x/30o2)  <  0  and  the  minus  sign  is  chosen  if 
0  x/a0o  )  >  0.  The  condition  t  <  0  corresponds  to  the  interference  between 
rays  and  thus  "spatial  oscillations"  occur  (Brekhovskikh,  1982,  p.  42).  When 
t  >  0,  no  rays  are  present  and  the  acoustic  field  decreases  rapidly,  as  it  does  at 
shadow  zones.  Shadow  zones  are  regions  where  no  rays  penetrate;  therefore 
the  intensity  is  assumed  zero.  Again  this  is  an  incorrect  assumption  since 
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sound  does  indeed  penetrate  shadow  zones  due  to  scattering,  internal  waves 
and  diffraction  effects  (Kinsler,  1982,  p.  403). 

2.  Eigenrays 

In  order  to  estimate  the  multipath  arrivals  at  a  given  receiver 
location  using  ray  tracing  methods,  it  is  necessary  to  determine  all  ray  paths 
which  pass  through  the  receiver  location.  Rays  that  travel  from  the  source 
and  pass  through  the  receiver  location  are  referred  to  as  eigenrays.  Since  the 
ray  paths  are  infinitesimally  thin,  it  is  reasonable  to  set  some  boundary  limits 
at  the  receiver  location.  That  is,  if  a  ray  passes  within  a  specified  distance  (e.g., 
±  5  m)  from  the  receiver,  it  is  considered  an  eigenray.  Even  with  boundary 
limits  set,  finding  all  the  eigenrays  represents  a  formidable  problem  since  a 
multitude  of  rays  must  be  traced  in  order  to  effectively  saturate  the  receiver 
location  with  rays.  If  a  full  three-dimensional  ray  trace  model  (i.e.,  c  =  c(x,y,z) 
and  varying  bottom  bathymetry)  is  used,  the  problem  becomes  very  difficult, 
computationally  intensive  and  can  be  prone  to  errors  (Porter,  1987,  p.  1355). 

B.  GAUSSIAN  BEAMS 

1.  General 

The  Gaussian  beam  (GB)  tracing  method  involves  associating 
"...with  each  ray  a  beam  with  a  Gaussian  intensity  profile  normal  to  the  ray.” 
(Porter,  1987,  p.  1349)  The  GB  contribution  at  a  particular  point  can  be  used  to 
scale  other  quantities  such  as  intensity  or  pressure.  The  problems  associated 
with  some  ray  tracing  methods,  as  described  above,  are  not  present  in  the  GB 
method.  The  energy  at  focusing  points  is  finite  and  shadow  zones  do  contain 
some  sound  energy.  The  GB  method  also  eliminates  the  need  to  perform 
eigenray  tracing  to  compute  multipath  arrivals. 
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2.  Gaussian  Beam  Equations 

The  GB  method  basically  involves  solving  a  set  of  ordinary 
differential  equations  along  each  ray  path.  These  equations,  which  are 
derived  from  a  parabolic  equation  solution  in  the  vicinity  of  each  ray,  are 
derived  in  Cerveny  (1982,  pp.  109-113)  and  Porter  (1987,  pp.  1356-1357).  They 
are  related  to  the  width  and  curvature  of  the  beam  associated  with  a  particular 
ray  path.  The  ray  path  in  this  case  describes  the  central  axis  of  the  associated 
beam.  These  equations  are  coupled  ordinary  differential  equations  and  can  be 
solved  numerically,  along  with  the  ray  equation  (Equation  2.9),  by  the 
methods  described  in  Chapter  II.  The  equations,  in  terms  of  the  arc  length  (s) 
along  the  ray,  are 


dq 

ds 


=  c(s)  p(s) , 


(4.4) 


and 


(4.5) 


where  cnn  is  the  second  derivative  of  the  sound  speed,  in  a  direction  normal 
to  the  ray.  Note  that  the  functions  p(s)  and  q(s)  are  also  complex  quantities, 
that  is,  p(s)  =  pj(s)  +  ip2(s)  and  q(s)  =  qi(s)  +  iq2(s). 

The  functions  p(s)  and  q(s)  can  be  related  to  the  beamwidth  L(s)  and 
curvature  K(s)  as  follows 


L(s)  = 


(4.6) 


K(s)  =  -  c(s)  %$. 


p(s)l 
q(s)  J 


(4.7) 
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where  Im[ >  and  %e{]  denote  the  imaginary  and  real  parts  of  the  complex 
argument.  The  beamwidth  L(s)  is  actually  the  effective  beam  radius,  or  the 


Figure  4.2  Gaussian  Distribution  Normal  to  Ray  Path 


width  normal  to  the  ray  at  which  the  beam  amplitude  is  e-1  of  its  maximum 
value.  (Porter,  1987,  p.  1350)  This  is  illustrated  in  Figure  4.2,  which  shows  the 
Gaussian  distribution  in  a  direction  normal  to  the  ray  path  (Cerveny,  1982, 
p.  115). 


Once  the  ray  path  and  the  functions  p(s)  and  q(s)  are  solved,  the 
beam  contribution  can  be  computed  at  any  point  (s,n)J  as  follows 


u 


beam 


(s,n) 


"AV 


c(s)  l-ici)  (t(s)  +  0.5  (p(s)/q(s)}  n2)] 


xq(s) 


(4.8) 


1  The  coordinate  (s,n)  represents  a  point  in  a  ray-centered  coordinate  system. 
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where  A  is  an  arbitrary  constant,  t(s)  is  the  travel  time  along  the  ray  path  and 
n  is  the  distance  normal  to  the  ray  path.  The  square  root  of  Equation  (4.8)  is 
defined  as 


VC(S)  _  .  .m(s) 

x  q(s)  ~('V 

where  m(s)  is  the  number  of  times  that  q(s)  crosses  the  imaginary  axis.  (Porter, 
1987,  p.  1350) 

The  expansion  of  a  point  source  into  beams  is  given  in  Porter  (1987, 
p.  1351).  This  is  necessary  because  a  point  source  is  not  "beamlike"  and  must 
therefore  be  approximated  by  a  superposition  of  beams.  The  final  result  of  the 
point  source  expansion  is  the  following  expression  for  the  beam  field 


c(s) 


x  q(s) 


(4.9) 


u(s,n) 


|  86 


U0O) 


beam 


(4.10) 


where  6e  is  the  angular  spacing  between  the  beams  (rays)  and  uq0  boam  is  the 
beam  with  an  initial  beam  angle,  or  take-off  angle,  of  60j-  The  beam  field  at  a 
point  (s,n)  is,  therefore,  the  sum  of  all  beam  contributions  at  the  same  point, 
multiplied  by  a  beam  constant. 


Tue  beam  equations  so  far  have  been  presented  in  ray-centered 
coordinates.  In  order  to  avoid  conversions  from  cylindrical  to  ray-centered 
coordinates,  an  expression  for  the  beam  field  in  cylindrical  coordinates  is 
used.  The  beam  field  in  cylindrical  coordinates  u(z,x)  is  equal  to  u(s,n)  as 
given  in  Equation  (4.10),  with  the  exception  of  ue0beam  which  is  redefined  as 


ue0jt’cam  <2 


!'*>  -  a 


p(x) 


C^x)  exP  l  'ico  1  +  cfe tzAz  +  0  5  (Az)2 


(0-5  q(xjtr'  +  2c(z)ltztr 


c(z) 


?tzz)}]. 


(4.11) 
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where  (tz,tr)  is  the  local  tangent  vector  to  the  ray  path  and  cn  and  cs  are  the  first 
derivatives  of  the  sound  speed  in  directions  normal  to  and  tangent  to  the  ray 
path,  respectively.  The  local  tangent  vector  (tz,tr)  is  simply  equal  to  (cos0,sine). 
The  quantity  Az  is  defined  as  the  distance  in  the  z  direction  between  the 
receiver  depth  (zr)  and  the  ray  path  at  (z,x).  (Porter  ,  1989,  p.  2) 

3.  Initial  Conditions 

The  initial  conditions  for  the  ray  equation  are  given  by  the  source 
position  (z0,x0)  and  initial  ray  angle  0O.  The  initial  conditions  for  the 
functions  p(x)  and  q(x),  however,  are  still  an  area  of  current  research.  Some 
authors  (Cerveny,  1982)  suggest  choosing  initial  conditions  so  as  to  minimize 
the  beamwidth  at  the  receiver.  This  leads  to  a  minimum  number  of  beams 
needed  to  describe  the  field  but  requires  different  initial  constants  for  each 
beam,  which  may  invalidate  Equation  (4.8)  (Porter,  1987,  p.  1350).  Madariaga 
(1984)  suggests  initial  conditions  that  closely  follow  WKB  theory.  This 
approach,  however,  has  p(x)  caustics  where  the  beam  reduces  to  a  point 
(Madariaga  ,1984,  p.  596). 

The  approach  used  in  this  thesis  is  the  one  suggested  by  Porter 
(1987).  They  choose  initial  conditions  so  that  the  beams  are  initially  flat 
(i.e.  K(0)  =  0)  and  in  the  far  field  the  beams  are  "space  filling."  Their  initial 
conditions  are  defined  as 


p(0)  =  1  ,  and  q(0)  =  ie, 

where  e  is  defined  as 

2cp2 

6  to  (50)2 ' 


(4.12) 


(4.13) 
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4.  Beam  Contribution  Computation 

To  compute  the  multipath  arrival  times  and  intensities  at  a  given 
receiver  location,  a  fan  of  beams  (rays)  is  traced.  The  contribution  of  each 
beam  at  the  receiver  location  is  then  computed.  These  beam  contributions 
can  then  be  used  to  scale  some  useful  quantity  such  as  the  acoustic  pressure  or 
intensity.  To  determine  the  beam  contribution,  it  is  necessary  to  first 
determine  the  ray  path  segments  (ds)  whose  normals  bracket  the  receiver 
point  at  the  receiver  depth.  This  concept  is  illustrated  in  Figure  4.3. 


Figure  4.3  Ray  Path  Segments  and  Normal  Intercepts 


Figure  4.3  shows  that  the  ray  path  segments  at  (Zi,Xj)  and  (Zj+i,Xj+i), 
with  angles  0,  and  0;+],  have  normals  that  bracket  the  receiver  location.  To 
compute  the  intercept  (xjnt)  of  these  normals  and  the  receiver  depth  line 
(horizontal  dashed  line  in  Figure  4.3),  the  following  formula  is  used 
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sin8j 

Xint  -  Xi  +  (Zr  -  Zi)  cos0.  . 


(4.14) 


The  ray  path  segments  at  (Zj,Xj)  and  (Zj+^Xj+i),  therefore,  have  corresponding 
normal  intercepts  at  xa  and  Xb  as  shown.  The  following  relation  yields  the 
proportional  distance  of  the  receiver  range  (xr)  between  the  points  xa  and  Xb 

w  =  (xr  -  xa)/(xb  -  xa).  (4.15) 

The  contribution  at  the  receiver  location  can  then  be  approximated  by  linearly 
interpolating  the  necessary  quantities  z,  p(x),  q(x)  and  t(x)  along  the  ray  path 
segment,  between  the  points  (Zj,Xj)  and  (zj+i,x;+i),  by  the  same  amount.  For 
example,  the  travel  time  t(x)  used  in  Equation  (4.11)  would  be  computed  as 
t(x)  =  t(xj)  +  w{t(xi+i)  -  t(xj)}.  (Porter,  1987,  pp.  1352) 

To  compute  the  beam  field  over  a  large  area,  a  matrix  or  grid  of 
receiver  points  is  used.  The  methods  described  above  are  valid  except  that 
there  may  be  several  receiver  locations  bounded  by  the  ray  path  normals 
instead  of  just  one.  Also  there  may  be  several  receiver  depths  and,  therefore, 
multiple  ray  normal  intercepts.  If  a  continuous  wave  (CW)  source  is 
assumed  then  the  contributions  at  a  particular  receiver  location  are  summed 
over  all  beams  as  indicated  by  Equation  (4.10).  The  results  given  in  Porter 
(1987)  have  been  computed  in  this  manner.  If  an  impulsive  source  is 
assumed,  then  the  contributions  are  normally  kept  in  discrete  form,  that  is, 
the  contribution  information  computed  at  a  particular  receiver  is  not 
summed  so  that  arrival  time  information  is  preserved. 

5.  Reflection  at  Boundaries 

In  Porter  (1987)  only  reflections  at  the  sea  surface  are  considered.  A 
beam  undergoes  a  change  in  curvature  (K(s))  but  no  change  in  width  (L(s)) 
after  a  surface  reflection.  The  following  conditions  are  given 


p'  =  p  +  qN  , 


(4.16) 
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and 

q'  =  q> 


(4.17) 


where  the  primes  denote  the  quantities  after  the  reflection.  The  quantity  N  is 
defined  as 


N  = 


4  cz  (cose)2 
c2  sine  ' 


(4.18) 


where  cz  is  the  first  derivative  of  the  sound  speed  in  the  depth  direction  (c2  = 
g  when  c  =  c(z)).  Equations  (4.16)  and  (4.17)  are  considered  valid  even  at 
grazing  incidence.  Since  the  quantity 


=  £ 
q  q 


+  N 


(4.19) 


is  related  to  the  beam  width  and  curvature  (Equations  4.6  and  4.7),  Equation 
(4.19)  indicates  that  the  beam  width  remains  constant  and  the  beam  curvature 
changes  during  a  surface  reflection.  In  this  thesis.  Equation  (4.19)  is  also  used 
for  bottom  interactions.  (Porter,  1987,  pp.  1 351-1352) 
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V.  PARALLEL  PROCESSING 


A.  BACKGROUND 

Multiprocessor  computing  systems  can  offer  many  advantages  over 
conventional  uniprocessor  system  designs,  the  most  obvious  being  increased 
performance.  Other  benefits  include  increased  reliability,  fault  tolerance  and 
scalability  -  the  ability  to  add  performance  as  required.  Performance  however 
is  not  simply  a  linear  function  of  the  number  of  processors  in  a  system.  For 
example,  it  is  very  unlikely  that  a  system  with  N  processors  will  run  N  times 
faster  than  a  single  processor,  in  solving  the  same  problem.  Typically, 
multiprocessor  system  performance  or  throughput,  is  limited  by  many  factors 
including  interconnection  and  communication  schemes.  (Stone,  1987, 
pp.  278-283) 

To  solve  a  problem  on  a  parallel  processor  architecture  first  requires 
dividing  the  problem  up  into  areas  that  can  be  run  concurrently  Once  this  is 
done,  the  methods  of  communication  and  synchronization  between 
processors  must  be  chosen  (Howe,  1987,  p.  36).  The  issue  of  communication 
and  synchronization,  however,  is  usually  determined  by  the  architecture 
itself  and  cannot  be  changed. 

1.  Granularity 

A  useful  term  in  describing  how  a  problem  or  application  is  broken 
up  into  concurrent  activities  is  granularity.  Granularity  in  the  context  of 
parallel  processing  can  be  defined  as  "...an  indicator  of  how  much  computing 
each  processor  can  do  independently  in  relation  to  the  time  it  must  spend 
exchanging  information  with  other  processors."  (Howe,  1987,  p.  37)  The 
granularity  of  an  application  is  referred  to  as  being  either  coarse  or  fine¬ 
grained.  A  course-grained  application  requires  much  more  individual  and 


35 


independent  computation  time  at  each  processor  than  the  time  spent 
communicating  between  the  processors.  A  fine-grain  application  requires  less 
individual  and  independent  computation  time  at  each  processor  between 
periods  of  communications  between  the  processors.  (Howe,  1987,  p.37) 

Another  way  of  looking  at  granularity  is  to  define  two  quantities 
and  C  which  represent  the  time  taken  in  running  the  computational  part  of 
an  application  and  the  time  taken  communicating  results  between  processors, 
respectively.  A  coarse-grained  application  therefore  has  a  relatively  high  %/C 
ratio  while  a  fine-grained  one  has  a  relatively  low  %/C  ratio.  (Stone,  1987,  pp. 
283-284) 

2.  Communication  and  Synchronization 

Two  common  methods  of  processor  communications  are  the 
shared  memory  and  message  passing  approaches.  In  the  shared-memory 
approach,  data  is  communicated  between  processors  by  storing  it  in  a 
common  area  (memory)  where  other  processors  can  read  it.  The  message¬ 
passing  approach  is  a  poin?-Lo-point  scheme  where  data  generated  by  a 
processor  is  given  a  destination  (address).  The  data  is  then  passed  or  routed  to 
the  destination.  The  shared  memory  approach  is  analogous  to  a  bulletin 
board  whereas  a  message-passing  approach  is  analogous  to  mailing  a  letter 
(Howe,  1987,  p.  37). 

In  order  to  ensure  that  data  is  valid,  a  method  of  synchronization 
among  communicating  processors  is  required.  This  is  accomplished 
separately  and  explicitly  in  the  shared-memory  approach  through  the  use  of 
programming  constructs  such  as  semaphores  and  other  locking  mechanisms. 
It  is  up  to  the  programmer  to  use  these  constructs,  for  example,  to  ensure  that 
data  is  not  read  before  it  is  valid.  In  the  message-passing  approach 
synchronization  is  handled  implicitly.  That  is,  if  one  processor  is  waiting  to 
perform  a  communication  operation  with  another  processor,  they  must  both 
be  ready  before  the  communication  can  proceed.  This  is  also  referred  to  as 
blocked  synchronization.  An  example  of  this  can  happen  when  both 


36 


processors  are  at  different  points  in  their  programs  (e.g.,  one  is  still  computing 
while  the  other  executes  a  communications  statement).  The  processor  that 
reaches  the  communication  statement  first  will  be  forced  to  wait  until  the 
other  processor  reaches  the  same  point  in  its  program.  (Howe,  1987,  pp.  37-38) 

B.  T800  TRANSPUTER  ARCHITECTURE 

The  T800  transputer  is  a  specialized  microprocessor  which  integrates  a 
32-bit  processor,  a  floating-point  co-processor,  4Kbytes  of  static  RAM,  four 
INMOS  communication  links  with  a  DMA  controller,  two  timers  and  a 
configurable  external  memory  interface  on  one  VLSI  device  (INMOS,  1987,  p. 
1).  The  term  transputer  is  derived  from  the  words  transistor  and  computer 
and  just  as  transistors  are  the  building  blocks  of  large  and  sophisticated 
electronic  devices,  the  transputer  was  designed  to  be  the  building  block  of 
distributed  computing  systems  (INMOS,  1986,  p.  4).  A  block  diagram  of  the 
T800  internal  organization  is  shown  in  Figure  5.1. 

1.  Central  Processing  Unit 

The  32-bit  central  processing  unit  (CPU)  of  the  T800  consists  of 
instruction  processing  logic,  an  instruction  pointer,  a  workspace  pointer 
which  points  to  local  variables,  an  operand  register  and  an  evaluation  stack 
consisting  of  three  registers  -  A,  B  and  C.  All  instructions  refer  to  the  stack 
implicitly.  For  example,  the  ADD  instruction  adds  register  A  and  B  and  stores 
the  result  in  register  A.  As  shown  in  Figure  5.1,  the  T800  uses  three  internal 
buses  -  a  main  32-bit  bidirectional  address  and  data  bus  used  by  the  CPU  and 
the  floating  point  unit  (FPU)  to  access  internal  and  external  memory  and  two 
unidirectional  buses  used  by  the  CPU  to  access  the  FPU  and  data  links  directly 
(Electronics,  1986,  pp.  54-55).  The  T800  transputer's  instruction  set  follows  the 
load-and-store  approach  which  is  characteristic  of  reduced  instruction  set 
computers  (RISC)  (Gimarc,  1987,  pp.  59-63).  (INMOS,  1987,  pp.  4-5) 
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2.  Floating  Point  Unit 

The  T800  contains  an  integral  64-bit  FPU  which  provides  single  (32 
bit)  or  double  (64  bit)  arithmetic  and  which  conforms  to  the  ANSI-IEEE 
754-1985  floating  point  standard.  The  FPU  is  microcoded  and  operates 
concurrently  with,  and  under  control  of,  the  CPU.  The  FPU  evaluation  stack 


Figure  5.1  T800  Transputer  Block  Diagram 


consists  of  three  registers  AF,  BF  and  CF  which  can  hold  either  32-  or  64-bit 
data.  The  operation  of  the  FPU  evaluation  stack  is  the  same  as  the  CPU 
evaluation  stack  in  terms  of  load  and  store  effects.  Because  the  CPU  and  FPU 
work  concurrently,  it  is  possible  for  the  CPU  to  calculate  source  and 
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destination  addresses  for  the  FPU  while  the  FPU  is  working  on  previously 
supplied  data.  This  is  especially  important  in  operations  involving  arrays  of 
data.  Synchronization  points  are  used  in  the  instruction  stream  wherever 
data  needs  to  be  transferred  between  the  CPU  and  the  FPU.  The  first  processor 
finished  waits  for  the  other  to  complete  its  operation;  the  data  is  then 
transferred  and  both  proceed  again  concurrently.  (INMOS,  1987,  p.  17) 

The  T800  FPU  incorporates  a  fast  normalizing  shifter  because  of  its 
importance  in  performing  floating  point  arithmetic  and  because  it  was 
implementable  in  a  reasonable  amount  of  space  (silicon).  Logic  to  speed  up 
multiplication  and  division  operations  and  support  square  root  calculations 
was  also  added.  Standard  mathematical  functions  (e.g.,  sin,  cos)  are 
implemented  using  a  polynomial  approximation  method  which  is  slower 
than  some  other  methods  but  does  not  require  additional  FPU  hardware. 
(INMOS,  Tech.  Note  6,  pp.  7-8) 

3.  Timers,  Processes  and  Process  Scheduling 

The  T800  contains  two  32-bit  cyclic  timers  which  allow  operations 
such  as  reading  the  time  value,  delaying  execution  until  a  certain  time  has 
been  reached  and  timing  out  for  a  specified  amount  of  time.  One  timer  is 
high  priority  and  increments  every  microsecond  and  the  other  low  priority 
timer  increments  every  64  microseconds. 

Processes  are  one  of  the  fundamental  elements  of  the  transputer's 
model  of  concurrent  processing  -  the  Communicating  Sequential  Process 
(CSP)  model.  The  CSP  model  is  a  predicate  calculus  developed  by  C.A.R. 
Hoare  and  has  some  simple  rules 

•  data  may  not  shared  by  processes  running  concurrently, 

•  all  data  passing  is  done  through  communication,  and 

•  all  communication  is  synchronous. 
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The  CSP  model  therefore  is  basically  identical  to  a  message-passing 
architecture  with  blocked  synchronization.  (Davidson,  1988,  pp.  5-7) 

Processes  start  and  run  until  completion;  they  can  communicate 
with  other  processes,  spawn  other  processes  and  any  number  of  them  can  be 
run  in  parallel.  The  transputer  contains  a  microcoded  scheduler  which 
handles  the  running  of  processes  in  parallel.  Note  that  on  a  single  transputer, 
processes  run  in  parallel  are  actually  timesliced.  The  term  parallel  is  still  used 
in  this  case  because  the  CSP  model  does  not  make  any  assumptions  as  to 
where  processes  are  physically  running  (i.e,  on  which  machines).  Processes 
can  be  either  active  (running  or  waiting  on  the  process  queue  to  run)  or 
inactive  (waiting  for  communication  or  until  a  specified  time). 

Processes  are  run  at  either  low  or  high  priority  with  low  priority 
processes  running  only  when  there  are  no  high  priority  processes  active. 
High  priority  processes  are  run  until  completion  and  are  therefore  expected  to 
run  only  for  a  short  time.  Multiple  high  priority  processes  are  run  one  after 
another  until  all  are  done.  If  no  high  priority  processes  are  able  to  run  then 
the  first  low  priority  process  on  the  low  priority  queue  is  selected  for  running. 
In  order  to  run  several  in  parallel,  a  low  priority  process  is  given  two 
timeslices  (approximately  1  msec  for  each  timeslice)  to  run  before  being 
descheduled  and  put  at  the  end  of  the  low  priority  queue.  Descheduling  can 
only  occur  during  certain  instructions,  or  descheduling  points;  thereby 
ensuring  that  expression  evaluation  within  a  process  is  completed  first. 
Process  switching  times  are  typically  less  than  one  microsecond.  (ENMOS, 
1987,  pp.  6-7) 

The  CSP  model  specifies  that  all  communication  is  synchronous. 
Logically  this  means  that  a  process  waiting  to  communicate  with  another 
process  is  blocked  until  the  other  is  ready.  At  the  hardware  level,  a  process 
waiting  for  communication  is  descheduled  until  it  can  proceed  with  the 
communication.  Because  the  CSP  model  makes  no  assumptions  about  the 
underlying  hardware,  processes  may  be  run  in  parallel  on  the  same  processor, 
on  different  processors  or  a  combination  of  both.  In  whatever  case,  the 
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necessary  code  is  the  same.  Therefore  an  arbitrarily  large  system  with  many 
parallel  processes  can  be  run  on  a  single  transputer  or  many  transputers 
without  modification.  (Davidson,  1988,  pp.  5-7) 

4.  Communication  Links 

Communication  between  processes  is  achieved  through  the  use  of 
channels  and  is  point-to-point,  synchronous  and  unbuffered.  Channels 
between  two  processes  running  on  the  same  transputer  are  implemented  by  a 
word  in  memory.  Channels  between  two  processes  executing  on  different 
transputers  are  implemented  by  physical  links.  Links  consist  of  two  serial, 
unidirectional  wires  that  can  be  connected  directly  between  transputers.  Link 
interfaces  are  TTL-compatible  and  can  therefore  be  connected  directly  up  to 
approximately  two  feet  before  buffering  is  required.  The  link  receivers  use 
phase-locked  loops  to  overcome  phase  differences  between  the  signals  of 
different  transputers  but  are  sensitive  to  skew.  The  links  on  the  T800  can  be 
configured  to  run  at  either  5,  10  or  20  Mbits/sec.  All  links  run  at  the  same 
speed  except  Link  0  which  can  be  set  independently.  (INMOS,  1987,  pp.  42-43) 
(INMOS,  1988,  p.19) 

Data  is  communicated  as  a  series  of  bytes,  each  of  which  must  be 
acknowledged  before  the  next  is  sent.  Bytes  are  sent  as  11 -bit  packets  while 
acknowledgements  consist  of  a  start  bit  followed  by  a  stop  bit.  The  T800 
employs  overlapped  communications  so  that  an  acknowledgement  can  be 
sent  as  soon  as  a  data  packet  has  been  recognized.  This  acknowledgement  can 
also  be  recognized  before  the  the  data  packet  is  completely  sent,  thus  allowing 
the  next  data  packet  to  be  sent  immediately  after  the  last  one.  The 
communication  format  is  shown  in  Figure  5.2.  Data  buffering  is  provided  in 
the  T800  link  hardware  so  that  a  data  rate  of  1.74  Mbytes/sec  can  be  achieved 
in  one  direction  and  2.35  Mbytes /sec  when  data  is  sent  in  both  directions  at 
once.  (INMOS,  Tech.  Note  6,  pp.  12-13) 
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Setting  up  a  link  transfer  requires  approximately  20  cycles  (1  gsec). 
The  T800  uses  an  internal  eight-channel  DMA  controller  so  that,  once  a  link 
transfer  is  setup,  it  can  be  run  autonomously  from  the  processor,  only 
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requiring  one  read/write  cycle  every  32-bit  word  (usually  four  processor  cycles 
every  4  psec).  The  T800  also  uses  a  double  word  buffer.  The  link  hardware 
prefetches  the  next  word  to  be  transferred  into  the  second  buffer  while 
outputting  from  the  first.  It  is  possible  to  run  link  transfers  simultaneously 
on  all  four  links  without  seriously  degrading  the  performance  of  the  CPU. 

5.  Performance 

With  the  T800  transputer  running  at  a  clock  speed  of  20  MHz,  the 
CPU  is  capable  of  performing  10  million  instructions  per  second  (MIPS)  and 
the  FPU  capable  of  1.5  million  floating  point  operations  per  second  (MFLOPS). 
Some  performance  figures  for  double-precision  Whetstones  benchmarks  are 
given  in  Table  5.1  (INMOS,  Tech.  Note  27,  p.  10). 
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TABLE  5.1 


WHETSTONE  BENCHMARKS 


System 

Thousands  of  Double-Precision 
Whetstones  per  second 

T800  (20  MHz) 

(using  on-chip,  50  nsec  RAM) 

4000 

MicroVax  II 

(with  FPA  running  MicroVMS) 

925 

SUN-3 

(MC68020  @  16MHz  and  MC68881  @  12.5MHz) 

790 

VAX  11/780 

(8MB  memory,  FT  A,  running  under  UNIX  4.3BSD) 

715 

C.  PARALLEL  ALGORITHMS 
1.  Main  Types 

Stone  (1987)  discusses  two  basic  types  of  physical  computational 
models  -  the  particle  model  and  the  continuum  model.  Problems  which  are 
aassed  as  particle  models  are  typically  "full-information  functions"  (Stone, 
1987,  p.  196).  A  full-information  function  is  one  in  which  each  output 
quantity  depends  on  all  of  the  input  quantities.  Examples  of  this  are  the  FFT 
and  sorting  problems.  These  problems  can  benefit  from  parallel  processing 
but  are  typically  difficult  to  implement.  Problems  classed  as  continuum 
models  are  much  better  suited  to  parallel  implementation.  When  discretized, 
problems  of  this  form  tend  to  be  localized.  That  is,  each  point  is  dependent 
only  on  its  own  state  and  the  states  of  its  adjacent  neighbors.  Examples  of  this 
are  convective  heat  flow  and  fluid  flow.  (Stone,  1987,  pp.  180-196) 

In  this  thesis  we  are  interested  in  the  ocean  acoustic  ray  tracing 
problem  which  can  be  considered  as  belonging  to  the  continuum  model.  In 
fact  the  ray  tracing  problem  can  be  broken  up  into  parts  that  are  completely 
independent  of  each  other  thus  simplifying  the  parallel  application  further. 
The  parallelization  of  the  ray  tracing  algorithm  is  discussed  in  Chapter  VI. 
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2.  Processor  Workfarm 


a.  General 

A  processor  "workfarm"  is  an  approach  well  suited  for 
implementing  some  continuum  type  problems,  in  particular  those  that  are 
coarse-grained.  A  workfarm  basically  consists  of  multiple  processors  or 
workers,  and  a  controller  processor  which  divides  the  problem  into  smaller 
parts,  or  work  packets,  and  distributes  these  work  packets.  Each  worker 
processor  runs  the  same  code  and  waits  for  work  packets  to  be  sent  from  the 
controller.  Once  the  work  is  completed,  the  results  are  returned  to  the 
controller  and  the  worker  waits  for  another  work  packet.  The  efficiency  of  a 
processor  workfarm  (or  any  multi-processor  system)  is  dependent  on  the 
utilization  of  the  processors.  Efficiency  is  highest  when  the  idle  time  is 
minimized;  that  is,  the  processors  are  kept  as  busy  as  possible  doing  useful 
work. 


Note  that  when  data  is  sent  (e.g.,  work  packets,  results),  it  is 
usually  sent  in  the  form  of  a  message.  Messages  typically  contain  address 
information,  a  message  header  and  the  actual  data  associated  with  that 
message.  The  address,  if  required,  is  used  to  route  the  message  to  the  correct 
processor.  The  message  header  indicates  what  type  of  message  is  being  sent 
and  indicates  the  type,  size,  etc  of  data  that  follows.  Messages  and  message 
formats  for  the  ray  tracing  problem  are  explained  further  in  Chapter  VI. 

A  typical  worker  process  is  shown  in  Figure  5.3.  The  outer  box 
represents  the  i-th  processor.  The  circles  inside  the  box  represent  the  processes 
running  in  parallel;  the  arrows  represent  communication  channels  and  the 
direction  of  the  communication.  The  arrows  (channels)  leaving  the  box 
represent  physical  links  while  the  other  arrows  inside  the  box  represent 
internal  channels. 
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b.  Workfarm  Processes 


The  process  called  Throughput  is  used  to  route  messages  (data) 
either  externally  or  internally.  Messages  destined  for  the  next  processor  are 
routed  through  the  channel  tonexi  and  messages  for  internal  (local)  use  are 
routed  through  the  channel  tolocal.  The  Render  process  is  the  process  that 
actually  performs  the  computational  part  of  the  algorithm.  It  accepts  work 
packets  from  channel  frominbuffer  and  outputs  results  through  channel 
tooutbuffer.  The  Feedback  process  multiplexes  messages  from  external 
processors  on  channel  fromnext  or  internally  from  the  Render  process  on 
channel  fromlocal.  The  messages  are  then  passed  up  the  line  on  channel 
toprev. 


c.  Buffer  Processes 

Buffer  processes  are  used  between  Throughput  and  Render  and 
between  Render  and  Feedback.  Becat  se  the  method  of  blocked  synchronization 
is  used,  these  buffers  ensure  that  neither  the  Throughput  or  the  Feedback 
processes  will  be  blocked.  This  situation  could  arise  for  example,  if  the  Render 
process  was  busy  performing  computations  when  a  new  work  packet  arrived. 
Without  the  buffer  process,  the  process  Throughput  would  be  blocked  from 
sending  the  work  packet  to  the  Render  process  until  it  finished  its  work.  This 
could  then  block  subsequent  messages  destined  for  other  processors,  thus 
degrading  the  efficiency  of  the  entire  system.  By  using  a  buffer  process,  the 
Throughput  process  is  always  able  to  "unload"  internal  messages  and  continue 
routing  other  messages.  The  buffer  process  thus  takes  over  the  responsibility 
of  being  blocked  until  the  Render  process  is  ready  for  more  work.  The  use  of 
buffering  processes  can  effect  the  workfarm  performance  as  indicated  in 
INMOS  Tech  Note  7. 

The  buffer  processes  can  also  be  used  to  buffer  (store)  work 
packets.  By  buffering  work  packets,  the  idle  time  of  the  render  process  can  be 
reduced,  thus  increasing  the  efficiency.  For  example,  if  two  buffer  processes 
are  used  between  the  Throughput  and  Render  processes,  it  would  be  possible  to 
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have  three  work  packets  at  a  given  processor,  at  any  one  time.  The  first  work 
packet  would  be  routed  immediately  to  the  Render  process  for  computation. 
The  second  work  packet  would  be  routed  to  the  Render  process  but  would  only 
get  as  far  as  the  second  buffer  process,  which  would  be  blocked  because  Render 
is  busy.  The  third  work  packet  would  then  only  get  as  far  as  the  first  buffer, 
since  the  second  buffer  is  blocked.  Once  the  Render  process  finishes  its  work,  it 
would  accept  the  work  packet  stored  at  the  second  buffer,  thus  allowing  the 
second  buffer  to  accept  the  work  packet  from  the  first.  By  having  work 
packets  available  to  the  Render  process  immediately,  the  idle  time  normally 
spent  waiting  for  a  new  work  packet  to  arrive  from  the  controller,  is  reduced. 


Processor  i 


Figure  5.3  Typical  Transputer  Workfarm  Processes/Channels 


Another  method  of  storing  multiple  work  packets,  is  to  store 
them  at  the  Throughput  process.  A  new  channel,  Requestmore  is  then  used  by 
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the  Render  process  to  indicate  that  it  has  finished  and  request  another  work 
packet.  The  Throughput  process  then  sends  the  work  packet  to  the  Render 
process.  This  method  can  eliminate  the  need  for  multiple  buffers  between 
the  Throughput  and  Render  processes  but  requires  storage  for  the  buffered  work 
packets,  an  additional  channel  and  program  logic  to  handle  the  requests. 

d.  Controller  Process 

The  controller  function  can  be  performed  by  a  processor  in  the 
network  (i.e.,  a  transputer)  or  by  a  host  computer.  The  controller's  task  is  to 
break  the  problem  into  work  packets  and  distribute  them  to  the  worker 
processors.  The  controller  may  also  be  responsible  for  receiving  and 
processing  the  result  packets  sent  by  the  workers  (e.g.,  graphically  displaying 
results).  Work  packets  can  be  either  addressed  or  not.  If  they  are  addressed, 
then  they  are  routed  to  a  specific  worker  processor  and,  if  not,  they  are 
handled  by  the  first  worker  who  receives  the  work  packet  and  is  able  to 
handle  more  work.  If  n  work  packets  can  be  buffered  by  each  of  the  m  worker 
processors  then  the  controller  starts  by  distributing  nxm  work  packets  to  the 
workers.  From  then  on,  new  work  packets  are  usually  not  sent  out  until  a 
result  packet  has  been  received.  In  this  way  only  nxm  work  packets  are  out  on 
the  worker  network  at  any  given  time. 

e.  Process  Priority 

Another  important  aspect  of  the  workfarm  is  the  process 
priority.  As  previously  stated,  processes  can  be  run  at  either  high  or  low 
priority  with  high  priority  processes  running  until  completion  or  until 
blocked.  It  is  a  common  misconception  to  think  that  the  computational  part 
of  the  algorithm  should  run  at  high  priority  and  the  communication  (routing 
and  buffering  processes)  should  be  run  at  low  priority.  This  type  of  setup 
however  actually  leads  to  decreased  performance.  The  workfarm  should  be 
set  up  so  that  all  communications  are  run  at  high  priority  and  computations 
run  at  low  priority.  If  the  computational  part  of  the  algorithm  were  run  at 
high  priority,  it  would  run  until  completion.  Meanwhile,  any 
communication  or  routing  performed  at  low  priority  would  be  completely 
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halted  until  the  computational  part  was  done.  This  could  mean  that  work 
destined  for  other  processors  would  not  get  through  and  processors  would  be 
idle.  Running  all  communication  at  high  priority  ensures  that  data  is  never 
blocked,  but  rather  routed  immediately.  Since  the  workfarm  approach  is  best 
suited  to  course  grain  problems,  the  amount  of  time  required  for 
communication  is  minimal  anyway,  compared  to  the  computation  time 
required.  Also  since  communication  links,  once  setup,  can  run 
autonomously  from  the  CPU,  the  computations  can  be  restarted  while  data  is 
transferred  by  the  link  engines.  (INMOS,  Tech.  Note  17,  pp.  13-20) 
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VI.  IMPLEMENTATION 


A.  DEVELOPMENT  SYSTEM 

1.  Hardware 

The  hardware  used  for  this  project  included  a  Macintosh  II,  a  Levco 
TransLink  II  transputer  motherboard  and  two  TransLink  modules.  The 
TransLink  II  motherboard  is  a  NuBus  compatible  card  that  can  support  up  to 
four  TransLink  modules.  The  Macintosh  II  can  hold  up  to  five  TransLink  II 
cards  for  a  total  of  20  -transputers.  Each  TransLink  module  consisted  of  one 
T800  transputer  and  1  MByte  of  RAM. 

2.  Software  Tools 

a.  Macintosh 

All  Macintosh  software  was  written  using  Symantec's 
Think  C  4.0  development  system.  Think  incorporates  a  fast  compiler, 
linker,  text  editor,  project  organizer  and  a  source  level  debugger  in  an 
integrated  environment.  Standard  ANSI  C  libraries  (e.g.,  stdio,  math,  etc.)  are 
included  as  well  as  an  object-oriented  class  library  for  creating  Macintosh 
programs. 

b.  Transputer 

A  variety  of  programming  languages  and  software 
development  systems  are  currently  available  for  the  transputer  including  C, 
Pascal,  Fortran,  or  are  soon  to  be  released,  such  as  Ada.  Of  special  note  is 
Occam  which  is  a  high  level  language  designed  specifically  to  express 
concurrent  algorithms  and  their  implementations  on  a  parallel  processing 
network  efficiently  and  easily.  Occam,  which  was  introduced  in  1982,  is  based 
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on  the  concepts  introduced  by  David  May  in  EPL  (Experimental  Programming 
Language)  and  C.A.R.  Hoare  in  CSP.  The  development  of  the  transputer  was 
closely  linked  to  Occam  and  in  essence  represents  an  Occam  architectural 
model  since  many  of  Occam's  constructs  are  implemented  directly  in 
hardware.  ( occam ®  2  Reference  Manual,  Preface) 

Software  for  the  transputer  was  written  using  Logical  Systems 
Transputer  Toolset  which  includes  the  software  tools  necessary  to  write  C 
programs  for  the  transputer.  The  tools  include  a  C  preprocessor,  a  C  compiler, 
an  assembler,  linker  and  a  file  librarian.  The  Transputer  Toolset  runs  under 
the  Macintosh  Programmer's  Workshop  (MPW)  shell  environment.  The  C 
language  was  chosen  over  Occam  so  that  all  software  could  be  written  in  one 
language.  This  made  it  possible  to  write  and  validate  algorithms  on  the 
Macintosh  before  they  were  ported  over  and  run  on  the  transputer.  This  was 
important  since  it  is  especially  hard  to  debug  parallel  software  running  on  a 
transputer. 


B.  RAY  TRACER 

1.  Numerical  Integration 

As  discussed  in  Chapter  III,  the  numerical  integration  method 
chosen  to  solve  the  ray  equation  was  the  Runge-Kutta-Fehlberg  method.  The 
algorithm  was  implemented  as  presented,  but  was  used  to  solve  not  only 
Equations  (2.16)  and  (2.17),  but  also  the  Gaussian  beam  parameters  given  by 
Equations  (4.4)  and  (4.5).  All  integration  was  performed  with  respect  to  the 
differential  change  in  range  (dx),  therefore  Equations  (4.4)  and  (4.5)  were 
modified  as  follows: 


and 


dq  _  c(z)  p(x) 
dx  “  cose  ' 


dp  Cnn  . 

dx  c^(z)  cose  ^  x 


(6.1) 


(6.2) 


50 


Since  the  functions  p(x)  and  q(x)  are  complex  quantities,  they  were  also 
separated  into  their  real  and  imaginary  parts  which  were  solved  separately. 
This  meant  that  a  total  of  six  equations  were  numerically  integrated 
simultaneously. 

Two  variables,  scaleMax  and  scaleMin,  were  used  to  limit  the 
amount  by  which  the  step  size  h  could  be  scaled  after  each  step.  These  values 
were  typically  set  to  2.0  and  0.1  respectively.  This  meant  that  the  step  could 
never  be  increased  by  more  than  twice  its  previous  size  or  reduced  by  more 
than  a  factor  of  0.1,  at  any  one  time. 

2.  Turning  Points 

When  the  argument  inside  the  square  root  in  Equation  (2.16) 
approaches  zero,  this  indicates  that  the  ray  is  approaching  a  turning  point.  As 
the  ray  turns  and  changes  direction,  this  argument  will  ideally  equal  zero  and 
then  begin  to  increase  positively  again.  However,  when  solving  this  equation 
numerically,  it  is  possible  to  cause  this  argument  to  go  negative  or 
equivalently,  to  step  the  ray  path  beyond  the  turning  point.  When  this 
happens  the  algorithm  reduces  the  step  size  by  one-half,  the  variable  scaleMax 
is  set  to  one,  thus  preventing  any  further  step  size  increase,  and  the 
integration  step  is  started  again.  This  process  is  repeated  until  the  step  size  is 
reduced  to  a  value  less  than  the  starting  step  size  hstart.  In  this  way,  the 
turning  point  is  approached  gradually. 

Once  the  step  size  is  reduced  less  than  hstart,  an  approximation  is 
made  to  solve  for  the  turning  point.  The  approximation  assumes  that  the 
sound  speed  profile  is  linear  at  this  point  and  the  ray  path  is  defined  by  the  arc 
of  a  circle,  as  discussed  in  Chapter  II.  The  geometry  used  for  this 
approximation  is  shown  in  Figure  6.1.  The  point  Zj  represents  the  dep  h 
value  where  the  approximation  is  started  and  the  angle  of  the  ray  segment  at 
this  point  is  6.  By  solving  the  relation 


c(Zi+1)  =  c(zj)  +  Azg 


(6.3) 
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for  Az,  where  Az  =  zi+i  -  zu  the  value  of  zi+]  (or  equivalently  ztp)  can  be 
computed.  Similarly,  the  relation 

Ax  =  (sin  0;  -  sin  6i+1)  (6.4) 

ao 

is  used  to  compute  the  corresponding  change  in  range  Ax.  (Clay,  1977, 
pp.  86-87) 


The  angle  at  the  turning  point  is  zero.  The  ray  path  is  then  stepped 
past  the  turning  point  to  z;+2.  The  depth  z,+2  is  set  equal  to  Zj,  the  new  angle  is 
equal  to  8  and  the  range  value  is  incremented  by  Ax.  At  this  point  the 
numerical  integration  is  restarted  using  the  starting  step  size  hstart.  The 
parameter  hstart  was  set  to  25  m  which  meant  that  the  corresponding  change 
in  depth  was  very  small.  Therefore  the  linear  sound  speed  profile 
approximation  is  only  used  very  close  to  the  actual  turning  point. 


Figure  6.1  Turning  Point  Geometry 


3.  Surface  Reflections 

After  each  integration  step  the  depth  value  of  the  new  ray  path 
point  is  tested  to  see  if  it  is  less  than  zero.  If  it  is,  a  flag  is  set  to  indicate  that  a 
surface  reflection  has  been  encountered.  The  step  size  is  then  reduced  in  the 
same  manner  described  above,  and  the  integration  step  is  repeated.  Once 
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again,  when  the  step  size  is  reduced  less  than  hstart,  the  linear  sound  speed 
profile  approximation  is  made.  The  geometry  used  is  shown  in  Figure  6.2. 
The  ray  path  is  first  stepped  from  zj  to  the  surface  (zr  =  0).  The  angle  at  the 
point  of  reflection  (er)  is  computed  using  Snell's  law  and  the  resultant  change 
in  range  Ax  is  computed  using  Equation  (6.4).  The  ray  is  then  stepped  to  zl+i  = 
z,  where  the  new  angle  is  equal  to  the  angle  6  at  Zj. 

4.  Bottom  Reflections 

Bottom  reflections  are  handled  in  a  manner  similar  to  the  surface 
reflections.  After  each  step  the  depth  value  of  the  new  ray  path  point  is  also 


\  0 


Figure  6.2  Surface  Reflection  Geometry 


checked  against  the  bottom  depth  at  that  point.  If  the  ray  path's  depth  value  is 
greater  than  the  bottom  value,  a  flag  is  set  indicating  that  a  bottom  reflection 
has  occured.  The  integration  step  is  then  repeated  with  the  step  size  reduced 
by  one-half.  This  process  is  repeated  until  the  step  size  decreases  below  hstart 
at  which  point  the  ray  is  stepped  into  the  bottom  reflection  point  using  the 
linear  sound  speed  profile  approximation  method.  A  bottom  tolerance  value 
is  used  to  determine  how  close  the  ray  is  stepped  towards  the  bottom.  The 
bottom  depth  is  provided  by  a  separate  function  that  uses  a  cubic  spline  to 
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interpolate  bottom  bathymetry  data.  This  function  takes  a  range  value 
argument  and  returns  the  depth  and  gradient  of  the  bottom,  at  that  range. 

Once  the  ray  path  is  stepped  to  the  bottom  reflection  point,  it  is  then 
reflected  out  from  the  bottom.  The  gradient  of  the  bottom  is  used  to 
determine  the  new  ray  angle  and,  after  a  bottom  reflection,  a  new  value  of 
Snell's  constant  (a)  must  be  computed.  The  geometry  used  in  computing  the 
angle  of  the  reflected  ray  path  is  shown  in  Figure  6.3.  The  incoming  ray  has 
an  incident  angle  b  with  respect  to  the  bottom.  The  reflected  ray  will  also 
leave  with  an  angle  of  b  when  measured  with  respect  to  the  bottom. 
However,  since  all  angles  are  computed  with  respect  to  the  horizontal  this 
approach  must  be  modified.  The  incoming  ray  now  has  an  incident  angle  d 
and  the  bottom,  at  the  point  of  reflection,  has  an  angle  a  with  respect  to  the 
horizontal.  As  shown,  the  reflected  ray  will  have  an  angle  of  c  which  is  equal 
to  a+b.  The  angle  b  however  is  equal  to  a+d;  therefore,  angle  c  is  equal  to  2 a+d. 
The  angle  a  of  the  bottom  can  be  computed  from  the  gradient  (gb)  and  is  equal 
to  tan_1(gb). 


/ 

reflected  ray 


Figure  6.3  Bottom  Reflection  Geometry 


It  should  be  noted  that  there  may  be  cases  where  the  sound  speed  is 
required  at  depths  below  its  maximum  tabulated  depth.  In  :s  case  the 
sound  speed  is  linearly  interpolated  by  the  algorithm  using  a  constant 
gradient  of  0.016  s-1.  This  gradient  represents  the  dependence  of  the  sound 
speed  on  depth  in  isohaline  and  isothermal  water  (Clay,  1977,  p.  90).  If  this 
approximation  is  not  accurate  enough,  then  the  sound  speed  profile  should 
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be  tabulated  to  the  maximum  depth  of  the  bottom.  A  similar  situation  arises 
when  the  bottom  bathymetry  is  not  tabulated  out  to  the  desired  receiver 
range.  In  this  case,  the  bottom,  past  the  last  tabulated  range  value,  is  assumed 
flat  at  a  depth  equal  to  the  last  tabulated  depth  value.  The  bottom  gradient  in 
this  case  is  equal  to  zero. 


C.  GAUSSIAN  BEAMS 


The  implementation  of  the  Gaussian  beam  contributions  was  fairly 
straightforward.  A  copy  of  the  program  used  by  Porter  and  Bucker,  written  in 
Fortran  77,  was  provided  and  used  as  the  basis  for  the  implementation  in  this 
thesis.  The  most  notable  problems  occurred  in  using  complex  variables  in  C. 
Routines  to  handle  complex  math  had  to  be  written  and  complex-valued 
formulas  had  to  be  broken  up  into  their  real  and  imaginary  components. 

Most  of  the  Gaussian  beam  algorithm  is  performed  after  the  ray  path  has 
been  computed.  The  following  portion  of  Equation  (4.11),  however,  can  be 
computed  at  the  same  time  as  the  ray  path 


Y_(05EMt2 

Y  "  q(x)  tr 


+  2 


Cn 

c(z)‘ 


tzfr 


(6.5) 


In  order  to  compute  the  Gaussian  beam  contribution  at  a  particular  receiver 
location,  it  is  convenient  to  store  the  ray  path  positions,  angles,  travel  times 
and  y  values  at  all  points  along  the  ray.  In  this  way,  the  ray  path  normals  at 
each  point  of  the  ray  are  checked  to  see  if  they  bracket  the  receiver  location.  If 
so,  the  corresponding  angle,  travel  time  and  y  values  are  readily  available  for 
use  in  computing  the  beam  contribution. 


D.  PARALLEL  PROCESSING 


1.  Parallelization  of  Ray  Tracer 

It  was  stated  in  Chapter  V  that  the  first  step  in  parallelizing  a 
problem  is  to  determine  which  parts  of  the  problem  can  be  solved 
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concurrently.  In  the  case  of  acoustic  ray  tracing,  each  ray  path  calculation  is 
independent  of  all  others.  Suitable  work  packets  can  therefore  be  formed 
using  individual  ray  path  calculations  as  the  atomic  element.  That  is,  work 
packets  can  be  made  up  of  one  or  multiple  ray  path  calculations.  If  a  work 
packet  consists  of  one  ray  path  calculation,  then  all  that  is  needed  to  describe  it 
is  the  initial  ray  angle.  If  a  work  packet  is  made  up  of  multiple,  equally  spaced 
ray  path  calculations,  then  an  upper  and  lower  angle  and  an  angle  increment 
(8e)  are  required.  Once  the  individual  ray  paths  have  been  determined,  the 
Gaussian  beam  contributions  at  particular  receiver  locations  can  then  be 
summed  from  the  individual  ray  contributions  at  the  same  locations. 

In  this  thesis,  work  packets  are  made  up  of  a  single  ray  path 
calculation  and  are  addressed  to  a  specific  processor.  A  listing  of  the  ray 
tracing  algorithm  is  given  in  Appendix  A.  Note  that  this  listing  is  for  the 
transputer  code  and  therefore  contains  the  workfarm  routines  and  the  ray 
tracing  algorithm. 

2.  Message  Formats 

In  order  to  send  information  to  and  receive  results  from  the 
transputers,  a  number  of  message  formats  were  devised.  Most  messages 
consist  of  a  message  header,  a  processor  identification  number  (address),  the 
length  of  the  message  and  the  associated  data.  The  message  header  indicates 
what  message  is  being  sent  and  the  processor  identification  number  indicates 
what  processor  the  message  is  intended  for.  The  length  of  the  message  is  the 
length,  in  bytes,  of  the  data  to  follow.  The  data  associated  with  a  message  is,  of 
course,  dependent  on  the  type  of  message.  Separate  messages  were  used  to 
send  sound  speed  profile  data,  bottom  bathymetry  data  and  other  parameters 
(e.g.,  integration  scale  parameters)  to  the  transputers.  A  listing  of  the  message 
numbers  used  is  also  given  in  Appendix  A. 
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E.  HOST 


1.  Macintosh  Programming 

The  Macintosh  computer,  first  introduced  in  1984,  has  become 
known  for  its  ease  of  use  and  its  unique  and  consistent  graphical  user 
interface.  Most  of  the  software  required  to  support  the  user  interface  is 
contained  in  ROM  (Read  Only  Memory),  which  is  referred  to  as  the  "toolbox". 
The  toolbox  contains  routines  for  such  things  as  screen  drawing,  windows, 
controls,  file  manipulation  and  memory  management.  The  specifications 
and  descriptions  of  these  routines  are  given  in  the  five  volume  reference  set 
"Inside  Macintosh."  By  using  the  toolbox  routines  where  applicable,  a 
consistent  user  interface  can  be  written  for  any  application.  This  means  that 
many  standard  operations  are  performed  in  the  same  manner  in  all 
application  programs  (e.g.,  saving  and  printing  files)  and,  as  the  Macintosh 
products  evolve,  applications  remain  compatible. 

Although  the  end  user  benefits  from  the  Macintosh  software 
design,  programming  the  Macintosh  is  a  difficult  task.  For  reasons  of  brevity, 
the  software  written  as  part  of  the  Macintosh  host  application  for  this  thesis, 
is  not  given.  However,  the  basic  structure  of  the  main  event  loop  of  the  host 
program  is  given  in  Appendix  B.  The  software  written  for  the  Macintosh 
includes  a  version  of  the  ray  tracer  algorithm  to  be  used  in  the  event  that  the 
program  is  run  on  a  Macintosh  without  transputers  installed. 

2.  Transputer  Interface 

All  data  sent  to  and  received  from  the  transputers  is  sent  as  a  series 
of  bytes.  This  is  handled  by  low-level  device  handling  routines  on  the 
Macintosh  and  channel  communication  routines  on  the  transputer.  The 
ordering  of  the  bytes  is  reversed  between  the  Macintosh  and  the  transputer; 
however,  the  actual  byte  reversal  is  handled  in  hardware  on  the  TransLink 
card  and  is  transparent  to  the  programmer.  The  TransLink  system  is  designed 
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so  that  the  Macintosh-to-transputer  interface  is  logically  seen  as  a  transputer 
channel.  The  Macintosh,  therefore,  can  only  communicate  directly  with  one 
transputer  on  each  TransLink  card. 

F.  OVERALL  ALGORITHM  SETUP 

This  section  gives  a  brief  description  of  the  ray  tracing  algorithm.  When 
the  program  is  first  started,  the  host  program  looks  for  and  boots  the 
transputers.  Once  the  user  has  set  various  parameters  (e.g.,  sound  speed 
profile  data  file,  ray  angle  limits,  etc.),  and  has  selected  the  ray  trace  command, 
a  number  of  data  structures  are  sent  to  the  transputers.  These  include  the 
sound  speed  profile  data,  the  bottom  bathymetry  data,  the  integration 
parameters  and  other  data  required  to  carry  out  the  ray  path  calculations.  The 
host  program  then  sends  the  work  packets  to  the  transputers,  receives  results 
from  them  and  plot?  the  results  on  the  screen  as  they  are  received,  until  all 
rays  have  been  traced.  If  the  user  chooses  to  perform  another  ray  trace, 
perhaps  after  changing  some  parameters,  the  process  described  above,  of 
sending  the  intial  data  structures  and  work  packets,  is  simply  repeated.  In  the 
case  that  no  transputers  are  installed,  the  program  will  still  function  but  all 
computations  will  be  done  on  the  Macintosh  II. 
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VII.  RESULTS 


A.  GENERAL 

This  chapter  presents  some  results  using  the  ray  tracing  and  Gaussian 
beam  algorithms  developed  for  this  thesis  and  results  from  the  optimization 
of  the  parallel  processing  setup.  The  first  of  three  example  ray  traces 
compares  an  analytic  ray  path  with  that  obtained  using  the  algorithm,  the 
second  example  demonstrates  ray  interactions  with  a  sloping  bottom  profile 
and  the  third  example  traces  rays  for  the  Munk  canonical  deep-water  sound 
speed  profile.  Two  Gaussian  beam  examples,  based  on  the  Munk  canonical 
deep-water  sound  speed  profile,  are  presented  as  well  as  plots  of  the 
behaviour  of  the  functions  used  in  computing  the  Gaussian  beams.  Finally 
the  results  of  the  optimization  of  the  parallel  processing  scheme  are 
presented. 

B.  RAY  TRACING 

1.  Comparison  with  Analytic  Example 

To  demonstrate  the  accuracy  of  the  ray  tracing  code,  a  simple 
example  is  presented  that  can  be  verified  by  comparing  it  to  an  analytic 
solution.  A  linear,  upward-refracting  sound  speed  profile  was  chosen  for  this 
example  and  is  shown  in  Figure  7.1.  Because  the  profile  is  linear,  the  ray  path 
follows  the  arc  of  a  circle  whose  radius  is  given  by  Equation  (2.21).  To 
simplify  the  example  further,  an  initial  ray  angle  of  0°  and  a  source  depth  of 
1000  m  was  also  chosen.  The  analytic  solution  yields  the  following  results 

Snell's  constant  =  a  =  6.83  x  10~4  s/m 


and 


radius  =  ~~  =  9.16  x  104  m. 

ag 

The  horizontal  distance  travelled  by  the  ray  in  any  layer  is  given  by 
(Clay,  1977,  p.  87) 

Xf  -  x,  =  (radius)  (sin  9j  -  sin  Bf).  (7.1 ) 

Note  that  Equation  (7.1)  has  been  modified  to  take  into  account  the  fact  that 
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Figure  7.1  Linear,  Upward-Refracting  Sound  Speed  Profile 
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all  angles  used  for  the  ray  tracing  code  are  measured  with  respect  to  the 
horizontal  and  not  the  vertical  as  in  the  reference.  Solution  of  Equation  (7.1), 
with  Xj  =  0  m  (i.e.  at  the  source)  and  xf  the  range  to  the  first  surface  reflection, 
yields 


Xf  =  1.3495  x  104  m. 

The  quantity  Xf  is  also  the  distance  between  the  sucessive  reflection  points  and 
turning  points,  as  labelled  in  the  ray  path  plot  of  Figure  7.2  (i.e.  X2  -  x\  is  equal 
to  X5-  X4,  etc.). 

A  comparison  of  the  results  between  the  analytic  and  numerical 
solutions  is  given  in  Table  7.1.  Note  that  the  numerical  solution  was 
computed  using  an  integration  tolerance  of  1  x  10"6. 


TABLE  7.1 


ANALYTIC  VERSUS  NUMERICAL  SOLUTION 


Range 

Point 

Analytic  Solution 

[x  104  m] 

Numerical  Solution 
[x  104  m] 

xi 

1.3495 

1.3495 

2.6991 

2.6990 

4.0486 

4.0486 

5.3981 

5.3981 

1 

6.7477 

6.7476 

8.0972 

8.0971 

9.4468 

9.4466 

As  can  be  seen  from  these  results,  the  ray  tracing  code  is  very  accurate  - 
differing  by  a  maximum  of  two  meters  in  the  total  100  km  range.  A  smaller 
integration  tolerance  would  have  yielded  even  higher  accuracy. 
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2.  Example  with  Bottom  Bathymetry 

The  next  example  demonstrates  the  ability  of  the  ray  tracing  code  to 
handle  bottom  bathymetry  data  and  ray  intersections  with  the  bottom.  As 
shown  in  Figure  7.3,  a  linear,  downward-refracting  sound  speed  profile  was 
chosen  so  that  all  rays  would  be  refracted  into  the  bottom. 


c(m/s) 
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Figure  7.3  Linear,  Downward-Refracting  Sound  Speed  Profile 


The  resultant  ray  plot  for  a  source  depth  of  3000  m,  initial  ray  angle 
of  -5°  and  an  upward-sloping  bottom  profile  is  shown  in  Figure  7.4.  The  ray 
path  exhibits  behavior  as  would  be  expected  for  rays  travelling  up  a  sloped 
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bottom.  As  the  ray  propagates  up  the  slope,  successive  reflections  (both 
bottom  and  surface  reflections)  occur  closer  and  closer  together.  This  is  due  to 
the  fact  that  when  the  ray  reflects  off  the  bottom,  its  reflection  angle  (with 
respect  to  the  horizontal)  is  increased  by  twice  the  slope  of  the  bottom  at  the 
point  of  reflection  (see  Figure  6.3).  This  causes  the  ray  path  to  'bunch-up'  as  it 
travels  up  the  slope.  Under  certain  conditions,  this  effect  can  even  cause  the 
ray  path  to  go  vertical  and  then  head  back  towards  the  source. 
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3.  Munk  Profile  Example 

This  example  uses  a  canonical  sound  speed  profile  given  in 
Porter  (1987),  which  is  referred  to  as  Munk's  canonical  deep-water  sound 
speed  profile.  The  profile  is  defined  by  the  equation 

c(z)  =  1500  {1.0  +  0.00737[u  - 1  +  e'u]},  z  <  5000m  (7.2) 

where 


u  =  2(z-  1300)/ 1300.  (7.3) 

A  plot  of  this  sound  speed  profile  is  given  in  Figure  7.5  and  a  ray  trace  plot 
using  this  profile  is  given  in  Figure  7.6.  Note  that  in  this  sound  speed  profile 
plot  the  box  symbols  represent  tabulated  or  input  values.  The  rays  are  traced 
from  +14°  to  -14°  with  an  angular  spacing  of  0.5°  between  rays  from  a  source  at 
1000  m  depth.  This  plot  compares  favorably  to  that  presented  in  Porter  (1987) 
and  is  used  in  subsequent  discussions  about  Gaussian  beam  results. 

C.  GAUSSIAN  BEAMS 

This  section  provides  some  preliminary  results  obtained  using  the 
Gaussian  beam  algorithm.  The  results  are  for  single  receiver  points  only 
(i.e.  not  total  field  calculations)  and  a  source  frequency  of  500  Hz.  The  ray 
tracing  conditions  (i.e.  sound  speed  profile,  etc.)  are  those  presented  above  for 
the  Munk  profile  example.  Note  that  the  term  Gaussian  beam  contribution 
used  in  subsequent  discussions  refers  to  the  fact  that  at  a  given  receiver 
location,  each  ray  will  contribute  to  the  acoustic  field. 

1.  Example  1  -  Receiver  at  Range  68  km,  Depth  1500  m 

For  the  first  example,  Gaussian  beam  contributions  were  computed 
for  a  receiver  location  at  a  range  of  68  km  and  a  depth  of  1500  m.  This 
receiver  location  was  positioned  away  from  apparent  shadow  zones  and 
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caustics  to  provide  a  more  simplified  example.  The  Gaussian  beam 
contributions  versus  the  initial  ray  angle  are  shown  in  Figure  7.7. 


As  can  be  seen  from  Figure  7.7  a  peak  occurs  at  2.5°  (=  0.04  radians) 
and  dropouts  occur  at  13°  (=  0.23  radians)  and  at  -10°  (=  -0.17  radians).  A  plot 
of  these  three  ray  paths  is  given  in  Figure  7.8. 
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Figure  7.6  Munk  Profile  Ray  Plot 
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Initial  Ray  Angle 


Figure  7.7  Gaussian  Beam  Contributions  vs.  Initial  Ray  Angle 


A  plot  of  the  beamwidth  or  effective  beam  radius  L(x),  as  given  in 
Equation  (4.6),  provides  an  explanation  for  the  peak  and  dropout  locations. 
Figure  7.9  below  illustrates  the  behaviour  of  the  Gaussian  beam  field  as  a 
function  of  the  ray  path.  Figure  7.9  shows  a  0°  ray  path  plot  with  a  Gaussian 
beam  field  superimposed  upon  it,  reflecting  the  results  of  Figure  2  in  Porter, 
1989.  The  shaded  areas  depict  the  Gaussian  beam  contribution  with  the 
darker  areas  representing  the  highest  values  (i.e.  higher  intensity,  lower 
transmission  loss,  etc.).  The  purpose  of  this  figure  is  to  simply  i’lustrate  the 
focusing  and  de-focusing  effects  of  the  Gaussian  beam  as  a  function  of  the  ray 
path  and  its  effect  on  contributions  at  the  receiver  location.  It  can  be  seen  in 
Figure  7.9  that  although  the  ray  path  passes  closer  to  receiver  2,  it  would  have 
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Figure  7.9  Example  Gaussian  Beam  Field 


a  larger  Gaussian  beam  contribution  at  receiver  1.  Therefore  receiver  2  is  too 
close  to  a  focusing  zone  to  see  any  significant  contribution  from  the  ray. 

It  is  this  focusing  and  de-focusing  effect  that  is  the  reason  for  the 
dropouts  in  the  Gaussian  beam  contribution  of  Figure  7.7.  The  dropouts  at 
-10°  and  13  °  are  caused  by  the  fact  that  the  corresponding  Gaussian  beams  are 
focused  and  have  small  beamwidths  at  the  receiver  range.  The  beamwidth  of 
the  2.5°  ray  is  also  focused  at  the  receiver  location,  however  not  as  much  as 
the  -10°  and  13°  rays.  This  is  displayed  further  in  Figures  7.10  through  7.12 
which  plot  the  beamwidth  L(x)  as  a  function  of  range  (x)  for  the  13°,  2.5°  and 
-10°  ray  paths  respectively. 
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Figure  7.10  L(x)  vs.  x  for  13°  Ray 
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2.  Example  2-5°  Ray  Path 

The  second  example  presented  takes  a  reverse  approach  in  order  to 
confirm  the  ideas  presented  above.  In  this  example  an  arbitrary  ray  path 
(other  than  13°,  2.5°  or  -10°)  was  chosen  and  a  plot  of  the  beamwidth  versus 
the  range  was  made  for  that  ray  path.  A  range  at  which  the  beam  focused  was 
chosen  and  the  Gaussian  beam  contributions  were  then  calculated  at  this 
range  value  to  see  if  a  dropout  existed  corresponding  to  the  ray  angle. 
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The  ray  angle  chosen  was  5°  and  a  plot  of  the  beamwidth  versus  the 
range  for  this  ray  is  shown  in  Figure  7.13.  The  ray  focuses  at  approximately 
15.7  km,  therefore  this  range  was  chosen  as  a  receiver  point.  Once  again  the 
receiver  was  located  at  1500  m  depth.  A  plot  of  the  ray  path  and  receiver 
location  is  shown  in  figure  7.14.  A  plot  of  the  Gaussian  beam  contributions  at 
the  receiver  location  is  shown  in  Figure  7.15.  It  is  clear  from  Figure  7.15  that  a 
dropout  occurs  at  the  chosen  ray  of  5°  as  expected. 
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Figure  7.14  5°  Ray  Path  Plot 
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3.  Behaviour  of  p(x)  and  q(x) 

As  stated  previously  in  Chapter  4,  the  quantities  p(x)  and  q(x)  (and 
similarly  p(s)  and  q(s))  are  derived  from  a  parabolic  equation  solution  in  the 
vicinity  of  each  ray.  They  are  related  to  the  beamwidth  L(x)  and  curvature 
K(x)  of  the  beam  associated  with  a  particular  ray  path  and  are  given,  in  ray 
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Figure  7.15  Gaussian  Beam  Contributions  at  x  =  16.5  km,  z  =  1500  m 


centered  coordinates,  in  Equations  (4.6)  and  (4.7).  The  quantities  p(x)  and  q(x) 
are  solved  along  with  the  ray  equation  and  are  complex  functions.  Plots  of 
p(x)  and  q(x)  as  a  function  of  range  are  given  below  in  Figures  7.16  and  7.17 
respectively  for  the  2.5°  ray.  These  plots  are  provided  for  illustrative  purposes 
since  p(x)  and  q(x)  are  such  fundamental  quantities  in  computing  the 
Gaussian  beams.  For  additional  information  on  the  derivation  of  p(x)  and 
q(x),  see  Appendix  A  of  Porter,  1987. 
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D.  PARALLEL  PROCESSING 


1.  General 

The  parallelization  of  the  ray  tracing  and  Gaussian  beam  algorithms 
involved  work  in  many  areas  including  the  design  and  implementation  of 
the  workfarm  process  structure  and  the  implementation  of  the  workfarm  and 
host  interface  routines.  This  section  presents  the  results  of  that  effort  and 
results  of  techniques  for  optimizing  the  parallel  processing  setup.  These 
optimization  techniques  focused  on  the  use  of  on-chip  RAM  and  did  not 
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involve  optimization  beyond  the  intended  workfarm  concept.  Any 
optimization  of  the  parallel-processor-based  Gaussian  beam  tracing  algorithm 
by  using  a  method  other  than  the  parallel  processing  workfarm  is  beyond  the 
scope  of  this  thesis  and  is  one  of  the  recommended  areas  of  future  work. 

2.  General  Observations 

a.  Data  Reporting 

One  concern  in  designing  the  parallel  algorithms  was  the 
reporting  of  data  from  the  transputer  network  back  to  the  Macintosh  II  host. 
Once  the  transputers  had  finished  a  ray  path  calculation  (i.e.  the  entire  ray 
path)  it  had  to  be  displayed  on  the  screen  and  therefore,  the  ray  path  data  had 
to  be  returned  to  the  host.  However,  this  proved  to  a  lengthy  process  in 
which  any  performance  gains  obtained  by  using  the  transputers  was  degraded 
in  data  transfer.  Performance  was  reduced  further  by  the  host  having  to 
convert  ray  path  data  into  screen  coordinates  for  plotting.  An  alternative 
solution  was  used  whereby  the  transputer  first  converted  the  ray  path  data 
into  Macintosh  screen  coordinates  (i.e.  x  and  y  values).  This  not  only  reduced 
the  size  of  the  data  that  had  to  be  transferred  but  also  reduced  the  processor 
workload  on  the  Macintosh  II  host. 

b.  ProcAlt  Function 

Another  problem  arose  in  returning  results  from  the 
transputers  to  the  host.  This  was  caused  at  the  feedback  process  and  involved 
the  alternation  between  the  two  channels  fromlocal  and  fromnext  (see  Figure 
5.3).  Recall  that  the  channel  fromlocal  accepts  internal  data  and  the  channel 
fromnext  accepts  data  from  the  next  transputer  in  the  network.  The  parallel  C 
function  ProcAlt  is  used  to  alternate  between  input  on  multiple  channels.  Its 
usage  and  syntax  are  given  in  the  following  example: 


idx  =  ProcAlt (fromlocal, fromnext, 0)  ; 
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In  the  example,  the  function  P  roc  Alt  checks  for  any  input  ready  on  either  of 
the  two  channels  fromlocal  or  fromnext.  If  input  is  ready  on  fromlocal  it  returns 
a  value  of  zero  and  if  it  is  ready  on  fromnext  it  returns  a  value  of  one.  If 
neither  channel  is  ready  it  returns  a  value  of  -1.  The  problem  arises  when 
both  channels  have  data  ready  simultaneously  which  was  the  usual  case.  The 
function  ProcAlt  tends  to  favor  the  channel  appearing  first  in  the  argument 
list,  in  this  case  fromlocal.  This  meant  that  any  transputers  further  along  in 
the  network  would  be  blocked  from  returning  their  results  and  from  doing 
any  more  work.  In  fact,  only  the  first  transputer  would  end  up  doing  any 
work,  thus  defeating  the  purpose  of  the  parallel  workfarm. 

The  solution  to  this  problem  was  to  simply  toggle  between  two 
separate  calls  to  the  ProcAlt  function  on  successive  passes  through  the  feedback 
process  code.  The  calls  were  the  same  except  that  the  channel  arguments 
were  reversed  in  order  between  the  two  statements.  Therefore  the  ProcAlt 
function  call  would  still  favor  the  channel  appearing  first  in  the  argument  list 
but  would  be  forced  to  alternate  between  them  properly. 

c.  Debugging 

Debugging  software  that  runs  on  a  parallel  processor  workfarm 
is  a  difficult  process  for  two  main  reasons.  The  first  reason  is  the  fact  that 
communications  between  the  host  and  the  network  of  transputers  is 
performed  over  a  single  serial  link.  This  means  that  any  debugging 
information  obtained  is  usually  done  using  special  message  formats  designed 
for  debugging.  It  is  therefore  important  to  design  message  formats,  etc.  to  be 
flexible  from  the  start  and  with  debugging  in  mind.  Some  transputer 
network  analyzers  do  exist  but  were  not  available  for  this  thesis  work.  The 
second  problem  also  arises  from  the  fact  that  all  inter-transputer 
communication  is  performed  over  serial  links  and  is  block-synchronized.  As 
stated  previously  in  Chapter  5,  block  synchronization  means  that 
communication  is  not  performed  until  both  channels  are  ready.  This  method 
of  synchronization  can  lead  to  a  condition  known  as  deadlock  whereby  one 
process  may  be  waiting  to  perform  communication  with  another,  but  is 
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unable  to  do  so.  This  process  in  turn  causes  another  to  wait  for 
communication  (i.e.  block).  This  condition  is  repeated  throughout  until  the 
entire  network  is  in  a  state  of  deadlock.  Deadlock  conditions  can  either  be 
difficult  or  easy  to  solve  and  is  not  a  problem  found  in  conventional 
sequential  algorithms. 

d.  Work  Packet  Buffering 

As  stated  previously  in  Chapter  5,  work  packets  can  be  buffered 
in  one  of  two  ways.  In  this  thesis  buffer  processes  were  used  between  the 
throughput  and  render  and  between  the  render  and  feedback  processes.  This 
eliminated  both  the  need  to  buffer  work  packets  in  the  throughput  process  and 
the  need  for  a  requestmore  channel.  Different  numbers  of  bcffer  processes 
were  tried,  varying  between  one  and  three.  In  this  particular  setup,  no 
noticeable  differences  were  observed  when  the  number  of  buffers  was  varied. 
That  is,  little  time  was  spent  waiting  for  more  work  to  arrive.  This  is  due  to 
the  fact  that  the  granularity  of  the  problem  was  relatively  large  and  the 
amount  of  time  spent  communicating  was  small  in  comparison  to 
computational  time. 

3.  Results  of  Optimization 

One  of  the  most  effective  optimization  methods  is  to  maximize  the 
use  of  the  on-chip  RAM  of  the  transputer.  Although  only  4  KBytes  of  on-chip 
RAM  are  available,  its  50  nsec  access  time  makes  it  three  times  faster  than  the 
off-chip  RAM  (150  nsec).  It  is  therefore  best  to  utilize  it  whenever  possible. 
Three  different  memory  allocation  arrangements  were  tried.  The  first  was  to 
use  off-chip  RAM  exclusively.  The  second  arrangement  used  on-chip  RAM 
for  the  stack  space  of  the  Render  process  and  the  third  involved  placing  certain 
data  structures  (variables)  and  frequently-called  routines  in  the  on-chip  RAM. 
The  results  are  given  in  Figure  7.18.  This  figure  shows  the  time  required  to 
compute  each  of  11  ray  paths,  from  5°  to  -5°  with  an  angular  spacing  of  1° 
between  rays.  The  corresponding  results  for  the  Macintosh  II  are  also  shown. 
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It  can  be  seen  from  Figure  7.18  that  the  use  of  on-chip  RAM  for  the 
render  process  stack  space  gave  the  best  results  -  at  least  twice  as  fast  as  the 
Macintosh  II.  Trying  to  place  some  data  and  frequently  called  functions  in  the 
on-chip  RAM  seemed  to  have  little  effect  in  speeding  up  th^  program.  Some 
performance  figures  given  for  the  T800  claim  that  it  can  perform  six  times 
faster  than  the  Motorola  68020/68881  combination,  as  found  in  the 
Macintosh  II  (Electronics,  1986,  p.  52).  These  figures  are  usually  determined  by 
using  program  code  that  is  run  completely  in  the  T800's  on-chip  RAM,  which 
has  an  access  time  of  50  nsec.  In  our  case,  the  results  for  the  program  run 
completely  in  off  chip  RAM  (150  nsec  cycle  time)  are  not  quite  twice  as  fast  as 
the  Macintosh  II.  Therefore,  assuming  that  if  the  program  could  run 
completely  using  on-chip  RAM,  it  would  run  between  four  and  six  times  as 
fast  as  the  Macintosh  II  version. 

The  fact  that  only  a  factor  of  two  performance  gain  is  realized  in  any 
one  transputer  over  the  Macintosh  II  does  not  seem  significant  at  first.  It 
must  be  realized  however  that  by  using  the  transputer  workfarm,  this 
performance  gain  can  be  further  multiplied  by  the  number  of  transputers  in 
the  network.  Therefore  with  20  transputers,  a  speedup  of  approximately  40 
can  be  achieved.  Of  course  an  upper  limit  exists  on  the  number  of  transputers 
that  can  be  added  before  other  problems,  such  as  communication  throughput, 
actually  degrade  performance.  This  upper  limit  could  not  be  found  since 
additional  transputer  resources  were  not  available  for  this  thesis. 
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VIII. 


CONCLUSIONS  AND  RECOMMENDATIONS 


A.  CONCLUSIONS 

This  thesis  has  attempted  to  develop  a  ray  tracing  model  suitable  for 
predicting  multipath  arrivals  and  associated  information  such  as  travel  time 
and  amplitude,  for  use  in  ocean  acoustic  tomography  problems. 

The  ray  tracing  code  developed  in  this  thesis  uses  the  Runge-Kutta- 
Fehlberg  method  to  integrate  the  differential  equations  used  in  determining 
the  ray  paths  and  associated  Gaussian  beams.  This  numerical  integration 
method  provides  flexibility,  accuracy  and  is  more  efficient  than  some  other 
methods.  The  resultant  ray  tracing  algorithm  is  also  flexible  and  very 
accurate. 

The  method  of  Gaussian  beams  is  used  to  estimate  the  arrival 
amplitudes  at  a  particular  receiver  location.  This  information  can  be  used  to 
scale  and  estimate  other  useful  quantities  such  as  the  pressure  or  intensity. 

This  thesis  has  also  shown  that  it  is  relatively  easy  to  take  advantage  of 
parallel  processing  without  having  to  buy  expensive  and  specialized 
equipment  and  software  development  tools.  The  transputer  offers  a 
relatively  cheap  and  efficient  solution  to  parallel  processing  without 
requiring  large  amounts  of  space  (20  in  one  Macintosh  II). 
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B.  RECOMMENDATIONS 


This  section  lists  some  areas  and  topics  related  to  this  thesis  that  are 
suitable  for  further  research. 

1.  Ray  Tracing 

The  model  could  be  made  into  a  full,  three-dimensional  model 
The  ray  tracing  could  take  into  account  the  range  and  azimutha’  dependency 
of  the  sound  speed  and  the  bottom  could  be  described  using  three 
dimensional  bathymetry.  In  order  to  accomplish  this,  new  equations  are 
required  for  defining  the  ray  path  and  the  Gaussian  beams.  Also,  a  suitable 
method  for  interpolating  the  sound  speed  profile  and  bottom  bathymetry  data 
in  three-dimensions  is  required.  In  addition  to  the  more  complex  bottom 
bathymetry,  some  form  of  bottom  loss  model  could  also  be  added  to  improve 
the  accuracy  of  the  model. 

2.  Parallel  Processing 

Only  two  transputers  were  used  in  this  thesis  to  implement  the 
parallel  processing  workfarm.  Additional  transputer  resource*  are  available 
at  the  Naval  Postgraduate  Schoc'  and  should  be  used  to  determine  at  what 
point  the  workfarm  speedup  peaks  (i.e.,  how  many  processors).  Also,  if  the 
model  is  made  more  complex  (e.g.  three-dimensional),  it  may  be  necessary  to 
break  the  ray  tracing  problem  into  different  parts  that  can  be  run  concurrently. 
For  example,  due  to  the  large  amount  of  information  that  may  be  required  m 
a  three-dimensional  problem  (sound  speed  profiles,  bathymetry  data),  it  may 
be  more  efficient  for  one  processor  to  store  this  information.  Other  processors 
would  then  request  (via  messages)  sound  speed  and  bottom  bathymetry 
information  from  this  processor. 
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3.  Miscellaneous 


Although  the  Gaussian  beam  contributions  were  set  up  to  retain 
travel  time  information  (i.e.  an  impulsive  type  source)  the  ability  to  compute 
the  total  acoustic  field,  could  be  added.  Also,  w  ith  the  Macintosh  II,  the 
addition  of  color  plots  would  be  useful  in  displaying  this  field. 
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APPENDIX  A 


TRANSPUTER  SOFTWARE  ROUTINES 


This  appendix  gives  a  brief  descrip  lion  and  listing  of  the  software  written 
to  run  on  the  transputer.  This  includes  the  routines  used  to  implement  the 
workfarm  and  the  routines  used  to  perform  the  ray  path  and  Gaussian  beam 
calculations. 

A.  WORKFARM 

The  routines  used  to  provide  the  workfarm  capabilities  are  implemented 
as  discussed  in  Chapter  VI.  One  notable  exception  is  the  use  of  the  buffer 
processes.  Instead  of  one  buffer  process  on  either  side  of  the  render  process,  a 
total  of  six  are  used.  This  still  allows  work  packets  to  be  buffered  in  a  different 
fashion.  Variables  to  store  extra  work  packets  are  not  required  in  the 
throughput  process  and  the  channel  recjuestmore  is  not  required. 

B.  RAY  TRACER 

1.  Numerical  Integration 
a.  RKF4 

This  routine  represents  the  core  of  the  ray  tracing  algorithm.  It 
is  used  to  perform  the  Runge-Kutta-Fehlberg  numerical  integration.  This 
routine  calls  reducestep  if  a  turning  point  or  other  boundary  interaction  is 
encountered. 
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b.  reducestep 


This  routine  is  called  whenever  a  turning  point,  surface 
reflection  or  bottom  reflection  is  encountered.  This  routine  simply  decreases 
the  step  size  and  ensures  that  the  integration  step  cannot  increase  further 
before  returning.  If  the  step  size  is  reduced  below  hstart,  this  routine  then 
calls  turnpoint. 

2.  Turning  Points 

a.  turnpoint 

This  "outine  is  called  whenever  a  turning  point,  surface 
reflection  or  bottom  reflection  is  encountered  and  the  step  size  has  been 
previously  reduced  less  than  hstart.  The  routine  turnpoint  handles  turning 
point  and  surface  reflection  approximations.  Bottom  rpfl.ections  are  handled 
by  calling  the  routine  bottomstep. 

b.  bottomstep 

This  routine  handles  stepping  the  ray  to  the  bottom  and  also 
steps  the  ray  path  away  from  the  bottom,  using  the  linear  sound  speed  profile 
approximation.  The  ray  path  is  stepped  to  within  a  user-specified  distance 
from  the  bottom. 

3.  Gaussian  Beams 
a.  Gamm 

This  routine  computes  a  portion  of  the  Gaussian  beam 
formula  at  the  same  time  as  the  ray  path,  as  discussed  in  Chapter  VI. 
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b.  sqrtBranch 

This  routine  determines  whether  the  function  q(s)  has  crossed 
the  imaginary  axis.  If  it  has,  then  the  array  index  (used  to  store  ray  path 
points)  is  stored  for  later  use. 

c.  GaussSumm 

This  routine  computes  the  Gaussian  beam  summation  at  a 
particular  receiver  point.  This  routine  is  called  after  the  entire  ray  path  has 
been  computed. 

4.  Support  Routines 

a.  control 

This  is  the  first  routine  called  once  a  new  work  packet  has  been 
received.  It  in  turn  calls  the  routine  RKF4  to  compute  the  ray  path  and  then 
the  routine  GaussSumm  to  compute  the  Gaussian  beam  summation.  This 
routine  then  converts  the  ray  path  coordinates  (x,y,z)  into  screen  coordinates 
(h,v)  for  the  Macintosh.  The  screen  coordinate  data  is  then  sent  to  the 
Macintosh  via  the  buffer  processes. 

b.  c(z) 

This  routine  returns  the  speed  of  sound,  along  with  its  first 
and  second  derivatives,  at  a  given  depth  value.  These  values  are  computed 
using  the  cubic  spline  interpolation  method. 

c.  bottomval 

This  routine  returns  the  bottom  depth  and  gradient  at  a  given 
range  value.  These  values  are  computed  using  the  cubic  spline  interpolation 
method. 
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d.  fen 


This  function  is  used  to  evaluate  the  ray  equation  at  the  given 
depth  value.  The  solutions  for  the  travel  time  and  the  Gaussian  beam 
equations  are  also  computed. 

e.  setup 

Ths  routine  converts  the  initial  ray  angle  from  degrees  to 
radians,  computes  the  initial  Snell's  constant  value  and  sets  up  the  initial 
conditions  for  the  Gaussian  beam  solution. 

f.  tpq 

This  function  computes  the  travel  time  and  the  p(s)  and  q(s) 
values  using  the  linear  sound  speed  approximation. 

g.  increment 

This  routine  simply  increments  the  array  index  used  to  store 
the  ray  path  and  Gaussian  beam  values.  It  also  checks  to  see  if  the  array  index 
is  within  the  specified  bounds. 

h.  Complex  Math  Functions 

Complex  math  functions  are  not  handled  intrinsically  in  C,  as 
they  are  in  Fortran.  These  routines,  therefore,  provide  simple  complex  math 
operations  such  as  add,  multiply,  divide,  etc.,  given  two  complex  arguments. 

C.  MAIN  LISTING 

/**********************************************************************/ 

♦include  <conc.h>  /*  concurrency  routines  for  transputer  */ 

♦include  <math.h>  /*  standard  math  library  */ 

♦include  "messagelD . h"  /*  message  constants  */ 

♦include  "raydefs.h"  /*  ray  tracing  constants  and  structures  */ 

♦define  TRUE  1 

♦define  FALSE  0 


91 


♦define  true 
♦define  false 


0 


♦define  MAXBUFFER  512 

♦define  STACKS I ZE  1000  /*  stack  size  for  Throughput 

and  Feedback*/ 

♦define  CHIPRAMSIZE  3500  /*  amount  of  on-chip  RAM  used  for  Render 

stack  frame  */ 

♦  define  getinpacket (c, p, s)  Chanln  (c,  (char  *)&p,s) 

♦  define  putinpacket  (c, p, s)  ChanOut (c,  (char  *)&p,s) 

♦  define  getoutpacket (c,p, s)  Chanln (c,  (char  *)&p,s) 

♦  define  putoutpacket (c, p, s)  ChanOut  (c,  (char  *)fip,s) 

int  ourlD  =0;  /*  our  unique  ID  number  */ 

int  downstream  =  FALSE;  /*  FALSE  =  no  transputers  after 

this  one  */ 

/■ft********************************************************************* 

♦ 

*  Throughput  process  of  standard  workfarm 

*/ 

int  throughput (procdesc, fromprev,  tonext,  tolocal) 
int  procdesc; 

Channel  ‘fromprev,  *tonext,  *tolocal; 

( 

int  msg, len; 

int  procID; 

Channel  *chan; 

int  dead, idx; 

char  *buff; 

double  angle; 

dead  =  FALSE; 

while  (idead)  (  /*  run  until  the  power  is  turned  off  !  */ 

do  { 

idx  =  ProcAlt (fromprev, 0) ;  /*  wait  for  something  on  */ 

}  while  (idx  ==  -1);  /*  fromprev  channel  */ 

switch  (idx)  { 

case  0:  /*  fromprev  */ 

msg  =  Chanlnlnt (fromprev) ;  /*  get  message  type  */ 
switch (msg) { 

case  MSG_PASS:  /*  used  for  passing  boot  code 

downstream  */ 

procID  =  Chanlnlnt (fromprev) ;  /*  get  processor  ID  */ 
len  =  Chanlnlnt (fromprev) ;  /*  length  of  message  */ 

/*  if  downsteam  =  FALSE  then  pass  only  data 
of  message  ie.  the  boot  code  */ 
if  (downstream  ==  FALSE) f 

passdata (fromprev, tonext, len) ; 
downstream  =  TRUE; 

} 

else{  /*  pass  the  whole  thing  */ 
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ChanOutlnt  (tonext, MSG_PASS) ; 

ChanOutlnt (tonext, procID) ; 

ChanOutlnt (tonext, len) ; 
passdata ( f romp rev, tonext , len) ; 

) 

break; 

case  MSG_PROCID: 

procID  =  Chanlnlnt ( fromprev)  ; 
if ( our ID  ==  0) 

ourlD  =  procID;  /*  we  now  have  an  ID  number  * 
else(  /*  we  already  have  an  ID  */ 

ChanOutlnt (tonext, MSG_PROCID) ; 

ChanOutlnt (tonext , procID) ; 

) 

break; 

case  MSG_ANGLE:  /*  work  packet  */ 

procID  =  Chanlnlnt (fromprev) ; 
len  =  Chanlnlnt (fromprev) ; 

Chanln (f romprev, Sangle, len) ; 

if (procID  ==  ourlD)  /*  message  is  for  us  */ 
chan  =  tolocal; 

else  /*  message  is  not  for  us  */ 

chan  =  tonext; 

ChanOutlnt (chan, MSG_ANGLE)  ; 

ChanOutlnt (chan, procID)  ; 

ChanOutlnt (chan, len)  ; 

ChanOut (chan, Sangle,  len)  ; 
break; 

case  MSG_WORK:  /*  work  parameters  */ 

procID  =  Chanlnlnt (f romprev) ; 
len  =  Chanlnlnt (f romprev) ; 
buff  =  (char  * ) malloc (len)  ; 

Chanln (f romprev, buff , len)  ; 
if (procID  ==  ourlD) 
chan  =  tolocal; 
else 

chan  =  tonext; 

ChanOutlnt (chan,MSG_WORK) ; 

ChanOutlnt (chan, procID)  ; 

ChanOutlnt (chan, len)  ; 

ChanOut (chan, buff, len) ; 

free (buf  f ) ; 

break; 

case  MSG_TEST:  /*  test  message  */ 

procID  =  Chanlnlnt (fromprev) ; 
len  =  Chanlnlnt ( fromprev) ; 
if (procID  ==  ourlD) 
chan  =  tolocal; 
else 

chan  =  tonext; 

ChanOutlnt (chan, MSG_TEST) ; 

ChanOutlnt (chan, procID)  ; 

ChanOutlnt (chan,  len)  ; 
break; 
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default : 


) 


/*  used  to  receive  all  other  message 
types  -  they  are  stored  as  a  series 
of  bytes  */ 
procID  =  Chanlnlnt (fromprev) ; 
len  =  Chanlnlnt (fromprev) ; 

buff  =  (char  * ) malloc (len) ;  /*  allocate  space  */ 
Chanln (fromprev, buff , len) ;  /*  read  data  */ 

if (procID  ==  ourlD) 
chan  =  tolocal; 
else 

chan  =  tonext; 

ChanOutlnt (chan,msg) ; 

ChanOutlnt (chan, procID) ; 

ChanOutlnt (chan, len) ; 

ChanOut (chan, buff , len) ; 

free (buff);  /*  free  memory  */ 

break; 

) 

break; 
default : 

while  ( ! dead)  { 

/*msg  =  ChanlnChar (fromprev) ; */  /*  hang  */ 

) 

break; 


/*********************************************************************** 

* 

*  This  routine  is  used  by  the  throughput  process  to  pass 

*  data  through  from  an  input  channel  to  an  output  channel . 

*  it  does  so  in  blocks  of  up  to  MAXBUFFER  bytes.  Taken  from 

*  examples  provided  by  Levco. 

*/ 

void  passdata (inc,  outc,  len) 

Channel  *inc,  *outc; 
int  len; 


char  buffer [MAXBUFFER]  ; 
int  todo,  thislength; 


todo  =  len; 
while  (todo  >  0)  { 

thislength  =  todo>MAXBUFFER  ?  MAXBUFFER: todo; 
Chanln (inc, buffer, thislength) ; 

ChanOut (outc, buff er, thislength) ; 
todo  -=  MAXBUFFER; 


) 

/*********************************************************************** 

★ 

*  This  is  the  standard  buffer  process.  Note  that  although 

*  six  buffers  are  used,  only  one  copy  of  the  code  is 

*  required. 
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*/ 

int  buffer  (procdesc,  toBuffer,  frornBuffer) 
int  procdesc; 

Channel  *toBuffer,  *fromBuffer; 


int  msgID,  procID,  msgLength; 
char  *msg; 


while  (TRUE)  ( 

msgID  =  Chanlnlnt (toBuffer) ; 

procID  =  Chanlnlnt (toBuffer)  ; 

msgLength  =  Chanlnlnt (toBuffer) ; 

if  (msgLength  !=  0) {  /*  if  message  length 

Chanln  will  not  work 
msg  =  (char  *) malloc (msgLength) ; 

Chanln (toBuffer,  msg,  msgLength); 


} 


=  0  then 

ii  *  j 


ChanGutlnt (frornBuffer, 
ChanOutlnt  (frornBuffer, 
ChanOutlnt  (frornBuffer, 
if  (msgLength  !=  0){ 
ChanOut (frornBuffer, 
f  ree (msg) ; 


msgID) ; 
procID) ; 
msgLength) ; 

msg,  msgLength) ; 


1 


1 


/★★I*****'*********************-******-***************************'********'** 

* 

*  This  process  provides  timing  data  using  the  hiah  resolution 

*  timer  (1  nsec)  -  it  acts  like  a  stop  watch  (on/of f /on ... )  and 

*  is  run  at  high  priority. 

*/ 

int  stopwatch (procdesc, inc, outc) 
int  procdesc; 

Channel  *inc,  *outc; 

{ 

int  start, stop; 

int  toggle; 


while  (TRUE) { 


toggle  =  Chanlnlnt (inc) ; 

/* 

start  timing  */ 

start  =  TimeO; 

/* 

read  start  time  */ 

toggle  =  Chanlnlnt (inc) ; 

/* 

stop  timing  */ 

stop  =  Time(); 

/* 

read  stop  time  */ 

ChanOutlnt (outc, stop-start)  ; 

/* 

return  elapsed  time 

) 


■k 


The  heart  of  the  workfarm  code.  This  is  where  the  actual 
work  (computations)  is  done. 
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/*  global  va 

riables  */ 

int 

turnf lag; 

/*  turning  point,  bottom  or  surface 

int 

done  ; 

/♦  stop  work  flag  */ 

int 

cnt  ; 

int 

bcount ; 

double 

tcime; 

/*  travel  time  at  one  point  along  ra; 

double 

omega ; 

double 

epsilon; 

double 

scalesave; 

double 

permScale; 

position 

*  raypath; 

/*  ray  path  coordinates  */ 

gauss  beam 

*beampath; 

/*  beam  parameters  */ 

complex 

* gamma ; 

/*  partial  beam  result®  */ 

int 

‘branch; 

/*  sqrt  branch  for  q(s)  */ 

screen  pos 

♦screen; 

/*  Mac  screen  coordinates  of  ray  pat 

source  posn 

source ; 

/*  source  position  */ 

position 

receiver [ 1 ] [ 1 

];/♦  receiver  position  */ 

sound  profile  profile ; 

/*  SS  profile  data  */ 

bottom  profile  bprofile; 

/*  bottom  bathymetry  data  */ 

work  params 

work; 

/*  various  parameters  */ 

beam  params 

beampa  ram; 

/*  Gauss,  beam  parameters  */ 

ray_resul t 

f in ray; 

/*  end  point,  etc  of  ray  */ 

screen_data 

s; 

/*  size,  etc  of  Mac  screen  ♦/ 

Channel 

♦outChan; 

int  render (procdesc, tolo 

cal , 

fromlocal,  totimer,  fromtimer) 

int  procdesc; 

Channel  *tolocal,  *fromlocal 
/ 

,  * totimer,  * fromtimer; 

V 

int 

procID; 

stat  ’ 

int  numtimes  =  0; 

int 

len 

; 

int 

thetime 

; 

int 

temp; 

double 

theta; 

outChan  =  fromlocal; 

/*  allocate  the  memory  required  L<_>  store  all  the  data  */ 
raypath  =  (position  *)  malloc  (sizeof  (position)  *maxraypoir.ts)  ; 
beampath  =  (gauss_beam  *)malloc (sizeof (gauss_beam) *maxraypoints ) 
gamma  =  (complex  *) malloc (si zeof (complex ) *maxraypoints ) ; 
branch  =  (int  *) malloc (sizeof (int ) *maxraypoints )  ; 
screen  =  (screen_pos  *) malloc (sizeof (screen_pos ) *maxraypoints ) ; 

while  (TRUE)  { 

switch (Chanlnlnt (tolocal )  )  { 

case  MSG_ANGLE:  /*  work,  packet  */ 

procID  =  Chanlnlnt {tolocal ) ; 
len  =  Chanlnlnt (tolocal ) ; 

Chanln (tolocal, Stheta, len) ; 
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cent  rol (theta ) ; 
break ; 

case  MSG_SSPROFI LE  :  /*  S3  profile  data  */ 

procID  =  Chanlnlnt (tolocal ) ; 
ler.  =  Chanlnlnt (tolocal ) ; 

Chan In  (tolocal, &profile, len) ; 
break ; 

case  MSG_BPROFI LE :  /*  bottom  data  */ 

procID  =  Chanlnlnt (tolocal ) ; 
len  =  Chanlnlnt (tolocal ) ; 

Chanln  (tolocal, &bprofile, len) ; 
break; 

case  MSG_BEAM:  /*  Gauss,  beam  pa  rams  */ 

procID  =  Chanlnlnt (tolocal ) ; 
len  =  Chanlnlnt (tolocal ) ; 

Chanln (tolocal, Sbeamparam, len) ; 
break ; 

case  MSG_SOURCE:  /*  source  position  * i 

procID  =  Chanlnlnt (tolocal ) ; 
len  =  Chanlnlnt  (tolocal) ; 

Chanln (tolocal, Ssource, len) ; 
break ; 

case  MSG_RECEIVER :  /*  receiver  position  * i 

procID  =  Chanlnlnt (tolocal ) ; 
len  =  Chanlnlnt  (tolocal) ; 

Chanln (tolocal, Sreceiver[0] [0] , len) ; 
break; 

case  MSG  SCREEN:  /*  screen  parameters  * / 

procID  =  Chanlnlnt (tolocal ) ; 
len  =  Chanlnlnt (tolocal) ; 

Chanln (tolocal, &s, len) ; 
break; 

case  MSG_WORK:  /*  rnisc.  parameters  *! 

procID  =  Chanlnlnt (tolocal ) ; 
len  =  Chanlnlnt (tolocal) ; 

Chanln (tolocal, Swork, len) ; 
permScale  =  wor k . sea leMax ; 
break ; 

case  MSG_TEST :  /*  test  message  */ 

procID  =  Chanlnlnt (tolocal ) ; 
len  =  ChanI nlnt ( tolocal ) ; 
testmsg (procID, len) ; 
break ; 

default : 

/*  none  !  */ 
break ; 


} 

/************ 


***********  +  *****  *********************************„ **„**», 


Send  back  test  message  -  this  is  mainly  used  to  send 
back  various  'things' .  It  is  useful  during  software 
development  to  send  back  addresses  of  variables,  etc 


testmsg (procID, len) 


97 


int  procID,  len; 

t 

ChanOut Int (outChan, MSG_TEST) ; 

ChanOutlnt (outChan, procID)  ; 

ChanOutlnt (outChan,  len)  ; 

/**•»***★**********★•»*■***★'****★**★★*•*★*■*******★****★*★******  +  ***■****** 

*  This  function  coordinates  the  various  things  that  need 

*  to  be  done. 


void  control (theta) 

double  theta;  /*  theta  in  degrees  */ 

int  len; 

register  i; 


work . scaleMax  =  permScale; 

/* 

store  scaleMax  so  we  don't  * 

scalesave  =  permScale; 

/* 

lose  it  */ 

f inray . bottom  hits  =  0; 

f inray . surface  hits  =  0; 

RKF 4  (theta ) ; 

/* 

do  ray  trace  first  * 

Gauss Summ ( ) ; 

/* 

then  the  Gauss,  beam  Stuff  * 

/*  convert  raypath  to  Mac 

screen 

coordinates  */ 

for  (i  =  0;  i  <  cnt;  i  +  +)  { 

screen[i).h  =  HRaypos ( raypath [ i )  . x )  ; 
screen[i].v  =  VRaypos ( raypath [ i j  .  z )  ; 


) 


/*  send  the  data  back  to  the  Mac  (via  a  few  buffers  *! 
len  =  cnt'sizeof (screen_pos) ; 

ChanOutlnt  (outChan,  MSG_RAY__DATA)  ; 

ChanOutlnt (outChan, ourlD)  ; 

ChanOutlnt (outChan,  len)  ; 

ChanOut (outChan, screen,  len)  ; 

i 

/************************************-**********************-*rt******** 

*  Converts  depth  (z)  value  into  vertical  pixel  location 

*  for  Mac  display 
*/ 

int  VRaypos ( z ) 
double  z; 

( 

return(s.top  +  s . RayPixelsV* (z/s .RayScaleV) ) ; 

! 


/  * 

*  Converts  range  (x)  value  into  horizontal  pixel  location 

*  for  Mac  display 

*  / 

int  HRaypos ( r ) 
double  r; 

( 

return (s . left  +  s . RayPixelsH*  (r/s . RayScaleH)  )  ; 
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*  Computes  c,  c',  c1'  for  a  given  depth  value  using  cubic  spline 


*/ 

c  ( z , svel ) 
double 

z  ; 

/*  given 

depth 

*  / 

sound  speed 
{ 

int 

*  svel ; 

/*  used 

to  store  return  values  */ 

i  =  0; 

int 

endflag 

=  0; 

double 

w,  net  w; 

double 

del_z; 

int 

step; 

double 

endgrad 

=  0.016; 

/* 

typical  depth  dependent 

/* 

gradient  */ 

/*  find  upper  index  for  this  depth  in  SSP  data  */ 
i  =  0; 

while  (prof ile . 2 [ i ]  <=  z){ 


++  i ; 

if  ( i==prof ile . npt s ) (  /*  z  value  is  past  the  last  */ 

i  -=  1;  /*  tabulated  value  */ 

endflag  =  1; 
break; 

} 

) 

/*  evaluate  speed  of  sound  */ 
del_z  =  profile. z[i]  -  profile. z [ i — 1 ] ; 
if  (endflag  ==  1) 
w  =  1.0; 
else 

w  =  (z  -  prof ile . z [ i — 1 ])/ del_z; 
not_w  =  1.0  -  w; 

/*  speed  of  sound  */ 

svel->c  =  (not_w) *prof ile . c [ i-1 ]  +  w*prof ile . c [i ] 

+  del_z*del_z* (prof ile . a [i ] * (pow (w, 3 . 0 )  -  w) 

+  profile . a [i-1 ] * (pow (not_w, 3 . 0)  -  not_w) ) ; 
if  (endflag  ==  1){  /*  use  linear  interpolation  */ 
del_z  =  z  -  profile. z  [i] ; 
svel->c  +=  del_z*endgrad; 
svel->g  =  endgrad; 
svel->gg  =  0.0; 
t 

else  { 

/*  speed  of  sound  gradient  */ 

svel->g  =  (prof ile . c [ i ]  -  prof ile . c [i-1 ]) /del_z 
+  del_z* (prof ile . a [i ] * ( 3 . 0*w*w  -  1) 

-  prof ile . a [i-1 ] * ( 3 . 0*not_w*not_w  -  1.0)); 

/*  second  derivative  of  speed  of  sound  */ 

svel->gg  =  6 . 0* (not_w*prof ile . a [i-1 ]  +  w*prof ile . a [i ] ) ; 

I 


/★***★**★★★★*★★★★*****★***************★*******★**★********★**★********★* 
*  Compute  bottom  depth  and  gradient  for  a  given  range  value 
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*/ 

bottomval  (r,bot) 

double  r;  /*  given  range  value  */ 

bottom_point  *bot;  /*  used  to  store  return  values  */ 

{ 

int  i  =  0; 

int  endflag  =  0; 

double  w,not_w; 
double  del_r; 
int  step; 

/*  find  upper  index  for  this  range  in  bottom  data  */ 
i  =  0; 

while  (bprofile . r  [i ]  <=  r) { 

++i  ; 

if  (i==bprof ile . npts) {  /*  range  is  past  last  */ 

i  -=  1;  /*  tabulated  value  */ 

endflag  =  1; 
break; 

} 

t 

del_r  =  bprof ile . r [ i ]  -  bprofile. r [i-1]  ; 

w  =  (r  -  bprofile . r [i-1 ]) /del_r; 
not_w  =  1.0  -  w; 

if (endflag  ==  1)(  /*  bottom  value  is  set  to  last  */ 

bot->z  =  bprof ile . z [i ] ;  /*  tabulated  value  ie.flat  bottom  */ 

bot->g  =  0.0;  /*  gradient  is  zero  */ 

} 

else  { 

bot->z  =  (not_w) *bprof ile . z [i-1 ]  +  w*bprof ile . z [ i ] 

+  de.l_r*del_r*  (bprof  ile .  a  [i]  *  (pow  (w,  3 . 0 )  -  w) 

+  bprof ile . a [ i-1 ] * (pow (not_w, 3 . 0 )  -  not_w) ) ; 

bot->g  =  -1 . 0*  (  (bprofile. z [i]  -  bprof ile . z [i-1 ]) /del_r 
+  del_r* (bprof ile . a [i } * (3 . 0*w*w  -  1) 

-  bprof ile . a [i-1 ] * ( 3 . 0*not_w*not_w  -  1.0))); 

) 

} 

/★a*******1************************************************************** 
*  Evaluates  ray  equation  and  computes  p's  q' s  and  travel  time 


*/ 

fcn(h,z,b,misc,k) 
double  h, z; 

gauss_beam  *b; 

extra_stuff  *misc; 

double  k  [  5 ] ; 

( 

souncI_speed  svel; 

bottom_point  bot; 

double  temp, c2, cosine; 


/*  compute  z  value  */ 

if  (z  <  0.0)  /*  this  shouldn't  happen  */ 
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1.0; 


c (0 . 0, Ssvel ) ; 
else 

c ( z, isvel ) ; 

temp  =  pow ( 1 . 0/ (misc->a*svel . c) , 2 . 0 )  - 

if (temp  <  0 . 0) ( 

turnflag  =  turn_pt;  /*  turning  point  ! ! !  */ 

kfzval]  =  -1.0; 

) 

else 

k[zval]  =  h*sqrt (temp) ; 

/*  compute  p's  and  q's  */ 
c2  =  svel . c*svel . c; 
cosine  =  cos (misc->angle)  ; 

k[plval]  =  h*  (-1 . 0*svel .gg*b->ql/ (c2*cosine) ) ; 
k[qlval]  =  h* (svel . c*b->pl/cosine) ; 
k[p2val]  =  h* (-1 . 0*svel . gg*b->q2/ (c2*cosine) ) ; 
k[q2val]  =  h*  (svel . c*b->p2 /cosine)  ; 

} 

^*******************************************************'*'**********k***k 

*  Set  up  initial  parameters. 

*/ 

void  setup (svel,misc, beam) 
sound_speed  *svel; 

extra_stuff  *misc; 

gauss_beam  *beam; 

( 

if (misc->angle  ==0.0)  /*  a  little  fudge  required  */ 

misc->angle  +=  0.001;  /*  for  transputer  version  of  code  */ 

misc->angle  =  misc->angle*PI /I 80 . 0 ;  /*  convert  degrees  */ 

/*  to  radians  */ 

misc->a  =  cos (misc->angle) /svel->c;  /*  Snell's  constant  */ 

/*  determine  initial  direction  of  ray  */ 

if  ( (misc->angle  <  0.0)  ||  (misc->angle  ==  0.0  &&  svel->g  <  0.0)) 

misc->direction  =  down; 
else 

misc->direction  =  up; 
misc->angle  =  f abs (misc->angle)  ; 

/*  Gauss,  beam  I.C.'s  */ 

omega  =  2 . 0*PI*source . frequency; 

beam->pl  =  1.0; 

beam->ql  =  0.0; 

beam->p2  =  0.0; 

beam- ^-q2  =  1.0; 

/*  pick  optimum  estimate  for  epsilon  */ 
switch (beamparam. IC) { 

case  fillbeams;  /*  space  filling  beams  */ 

epsilon  =  2 . 0*svel->c*svel->c/ (omega* 

work . angle_incr*work . angle_incr ) ; 
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break; 

case  minwidthbeams :  /*  min.  width  beams  */ 

epsilon  =  svel->c* receiver [ 0 ][ 0 ]. x; 
break; 

case  cervenybeams :  /*  not  implemented  yet  */ 

break ; 


1 


} 

/*********************************************************************** 
*  Handle  turning  points  and  surface  reflections 

*/ 


void  turnpoint ( ray, misc, beam) 
position  *ray; 

extra_stuff  *misc; 

gauss_beam  *beam; 


{ 


sound_speed 

double 

double 

bottom_point 

int 


svel,  refl_svel; 

temp_z, temp_angle, temp_r; 

ref l_angle, delta_z, delta_r, step, same_g; 

bot  ; 

i; 


switch (turnf lag) { 

case  surf_ref 1 :  /*  surface  reflection  */ 

c (ray->z, &svel) ; 

temp_z  =  ray->z,  temp_angle  =  misc->angle; 
c (0 . 0, &refl_svel) ; 

ref l_angle  =  fabs (acos (cos (temp_angle) *ref l_svel . c/svel . c ) ) ; 
delta_r  =  fabs ( (sin (temp_angle)  -  sin (ref l_angle) ) / 
<misc->a*svel . g) ) ; 

if  (ray->x  +  delta_r  <  receiver [ 0 )[ 0 ]. x) { 
ray->z  =  0.0; 
f inray . surf ace_hits  +=  1; 
tpq (delta_r, misc->angle, Ssvel,  beam) ; 
increment ( ) ; 

Gamm(-1 . O*misc->angle*misc->direction, &svel , beam) ; 
ray->x  +=  delta_r; 

misc->direction  =  -1 . O*misc->direction; 
misc->angle  =  refl_angle; 
raypath[cnt]  =  *ray; 
beampath [cnt ]  =  *beam; 
sqrtBranch ( ) ; 

} 

if  ( ray->x  +  delta_r  <  receiver [0] [0] .x) { 

/*ttime  +=  fabs (delta_r/ (svel . c*cos (misc->angle) ) ) ; */ 
tpq (delta_r,misc->angle, &svel, beam) ; 
increment ( ) ; 

Gamm (-1 . O*misc->angle*misc->direction, Ssvel, beam) ; 

ray->z  =  temp_z,  misc->angle  =  temp_angle; 

ray->x  +=  delta_r; 

raypath[cnt]  =  *ray; 

beampath [cnt]  =  *beam; 

sqrtBranch ( ) ; 

) 

else  { 
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c (ray->z, Ssvel) ; 

delta_r  =  receiver [ 0 ][ 0 ]. x  -  ray->x; 
misc->angle  =  f abs (asin (sin (misc->angle)  + 
delta_r*misc->a*svel .g) ) ; 
delta_z  =  f abs ( (cos (misc->angle) /misc->a  - 
svel . c) / svel . g)  ; 

tpq (delta_r, misc->angle, & svel, beam) ; 
increment ( ) ; 

Gamm (-1 . O*misc->angle*misc->direction, &  svel , beam) ; 

ray->z  +=  misc->di rection*delta_z,  ray->x  +=  delta_r; 

raypathfcnt]  =  *ray; 

beampath [cnt ]  =  *beam; 

sqrtBranch ( ) ; 

done  =  true; 

) 

break; 

case  bott_refl:  /*  Bottom  reflection  */ 
bottom_step ( ray,  misc,  beam) ; 
break; 

case  turn_pt :  /*  turning  point  */ 
c (ray->z, Ssvel) ; 

same_g  =  svel.g;  /*  use  same  g  throughout  for  approx.  */ 
temp_z  =  -ray->z,  temp_angle  =  misc->angle; 
delta_z  =  fabs ( ( (1 . 0/misc->a)  -  svel.c) /same_g) ; 
delta_r  =  fabs (sin (misc->angle) / (misc->a*same_g) ) ; 
if  (delta_r  ==  0.0) {  /*  zero  start  condition  */ 

ray->z  +=  misc->direction* 0 . 05; 
c (ray->z, &refl_svel) ; 

misc->angle  =  fabs  (acos  (cos  (temp__angle) 

*ref l_svel . c/ svel . c) ) ; 

delta_r  =  fabs ( (sin (temp_angle) -sin (misc->angle) ) / 
(misc->a*same_g) ) ; 
ray->x  +=  delta_r; 

tpq (delta_r,misc->angle, &svel,beam) ; 
increment ( ) ; 

Gamm ( -1 . 0*misc->angle*misc->di recti on, Ssvel , beam) ; 
raypath[cnt]  =  *ray; 
beampath tent ]  =  *beam; 
sqrtBranch ( ) ; 

} 

else  { 

if  ( (ray->x  +  delta_r)  <  receiver  [ 0 ]  [ 0 ]. x) { 
ray->z  +=  misc->direction*delta_z; 
ray->x  +=  delta_r; 
bottomval (ray->x,  &bot)  ; 
if  (ray->z  >  bot.z){ 

ray->z  -=  misc->direction*delta_z; 
ray->x  -=  delta_r; 
bottom_step (ray, misc, beam) ; 
break; 

) 

else  { 

tpq (delta_r,misc->angle, Ssvel, beam) ; 
increment ( ) ; 

Gamm(-1 . O*misc->angle*misc->direction, Ssvel, beam) 
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misc->angle  =  0.0; 

misc->direction  =  -1 . O*misc->direction; 
raypath[cnt]  =  *ray; 
beampath [cnt ]  =  *beam; 
sqrtBranch ( )  ; 

} 

} 

if  ((ray->x  +  delta_r)  <  receiver [0] [0] .x) ( 
bottomval (ray->x  +  delta_r ,  &bot )  ; 
if  (temp_z  >  bot.z){ 

bottom_step (ray, misc, beam) ; 
break ; 

} 

else  { 

ray->z  =  temp_z; 
ray->x  +=  delta_r; 

tpq ( del ta_r, misc ->angle, &  svel , beam) ; 
increment ( ) ; 

Gamm (-1 . O*misc->angle*misc->direction, &svel , beam) ; 

misc->angle  =  temp_angle; 

rrypathfcnt]  =  *ray; 

beampath [cnt]  =  *beam; 

sqrtBranch ( )  ; 

) 

} 

else  { 

c (ray->z, Ssvel) ; 

delta_r  =  receiver [0] [0] .x  -  ray->x; 
misc->angle  =  fabs (asin (sin (misc->angle)  + 
delta_r*misc->a*same_g) ) ; 
delta_z  =  fabs (( (cos (misc->angle) /misc->a)  - 
svel . c) /same_g) ; 

tpq(delta_r,misc->angle, &svel,beam) ; 
increment ( ) ; 

Gamm(-1 . O*misc->angle*misc->direction, &svel,beam) ; 

ray->z  +=  misc->direction*delta_z,  ray->x  +=  delta_r; 

raypathfcnt]  =  *ray; 

beampath [cnt]  =  *beam; 

sqrtBranch ( ) ; 

done  =  true; 

} 

) 

break; 

)  /*  end  of  switch  */ 

} 


/★**★★★*★*★★**★********★****★★★*★*********★★**★*★****★*★******★*★*★**★** 
*  Steps  ray  path  to,  and  out  from  bottom. 

*/ 

bottom_step (ray,  misc,  beam) 
position  *ray; 
extra_stuff  *misc; 
gauss_beam  *beam; 

{ 

int  i; 
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sound_speed  svel , ref l_svel ; 

bottom_point  bot; 

double  step; 

double  delta_z , delta_r , refl_angle; 

c (ray->z, &svel) ; 

step  =  0.0,delta_z  =  0.0,delta_r  =  0.0; 
refl_angle  =  misc->angle; 
bottomval (ray->x,  &bot)  ; 
step  =  0.5* (bot. z  -  ray->z) ; 
i  =  0; 

while  (fabs (bot . z- (ray->z+misc->direction*delta_z) )  > 
work.bottom_tolerance) { 

++i  ; 

c (ray->z+misc->direction* (delta_z+step) , &refl_svel) ; 
refl_angle  =  fabs (acos (cos (misc->angle) *refl_svel . c/svel . c) ) 
delta_r  =  fabs ( (sin (misc->angle )  -  sin (ref l_angle) ) / 
(misc->a*svel . g) ) ; 
bottomval ( ray->x  +  delta_r , &bot ) ; 

if  ((bot.z  >  (ray->z+misc->direction* (delta_z+step) ) ) ) 
delta_z  +=  step; 
else 

step  =  0.5*step; 

} 

/*  step  ray  into  bottom  */ 

ray->z  +=  misc->direction*delta_z,  ray->x  +=  delta  r; 
tpq (delta_r,misc->angle, &svel, beam) ; 
increment ( ) ; 

Gamm(-1 . O*misc->angle*misc-->direction, &svel,beam) ; 
misc->angle  =  refl_angle; 

c (ray->z, &refl_svel) ;  /*  compute  svel  at  bottom  */ 

if  ( (misc->direction  =  down)  &&  ( (refl_angle  + 

2 . 0*atan (bot . g) )  >  00)) 
misc->direction  =  -1 . 0*misc->direction; 
f inray . bottom_hits  +=  1; 
raypath(cnt]  =  *ray; 
beampath [cnt]  =  *beam; 
sqrtBranch ( ) ; 

/*  step  ray  to  next  point  */ 

misc->angle  +=  2 . 0*atan  (bot . g) ;  /*  new  angle  after  */ 

/*  bottom  reflection  */ 

misc->angle  =  f abs (misc->angle) ; 
if  (misc->angle  >  PI/2.0) 
done  =  true; 

/*  compute  new  'a'  value  */ 
misc->a  =  cos (misc->angle) /ref l_svel  c; 
delta_z  =  10.0; 
c  (ray->z, Ssvel) ; 

c (ray->z+misc->di rection*delta_z, &refl_svel)  ; 
refl_arigle  =  fabs  (acos  (cos  (misc->angle)  *refl_svel .  c/svel .  c)  )  ; 
delta_r  =  fabs ( (sin (misc->angle)  -  sin (refl_angle) ) / 
(misc->a*svel . g) ) ; 
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ray->z  +=  misc->direction*delta_z; 
ray->x  +=  delta_r; 

tpq (delta_r,misc->angle, &svel,  beam)  ; 
increment ( ) ; 

Gamm(-1 . O*misc->angle*misc->direction, &svel,beam) ; 

misc->angle  =  refl_angle; 

raypath[cnt]  =  *ray; 

beampath [cpl]  -  -beam; 

sqrtBranch ( ) ; 


/*********************************************************************** 

*  Compute  travel  time  and  p's  and  q' s  (tpq's)  when  using  linear 

*  approximation  method  (ie.  at  turning  points,  etc) 

*/ 

tpq (delta_r, angle, svel, beam) 
double  delta_r, angle; 

sound_speed  *svel; 
gauss_beam  *beam; 

{ 

double  temp__p,  cosine,  c2; 

cosine  =  cos (angle); 
c2  =  svel->c*svel->c; 

ttime  +=  delta_r/ (svel->c*cosine) ;  /*  compute  new  time  */ 

/*  compute  new  p's  &  q's  */ 
temp_p  =  beam->pl; 

beam->pl  +=  delta_r* (-1 . 0*svel~>gg*beam~>ql / (c2*cosine) ) ; 
beam->ql  +=  delta_r* (svel->c*temp_p/cosine) ; 
tempjp  =  beam->p2; 

beam->p2  +=  delta_ r* ( -1 . 0 *svel~>gg*beam->q2 / (c2*cosine) )  ; 
beam->q2  +=  delta_r* (svel->c*temp_p/cosine) ; 

} 

/★★★★*★*★***★★*★*★★★★★*★★★**★★★***★★*★*★★★*★*★★**★★*★*★*****★★*★★*★**★** 

*  The  Runge-Kutta-Fehlberg  algorithm 
*/ 

/*  Vars  Global  to  RKF  and  reducestep  routines  */ 


double 

hstart  = 

25.0;  /*  starting  step  size 

double 

h;  /* 

step  size  */ 

position 

ray;  /* 

single  ray  path  position  */ 

extra_stuf f 

misc; 

gauss_beam 

gb; 

void  RKF4 (theta) 

double 

theta; 

f 


double 

bottom_point 

sound_speed 

gauss_beam 

double 

double 

double 


k[7] [5] ; 

bot; 

svel; 

newb; 

newz; 

scale; 

err, zstep; 
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source. y,  ray.x 


source . x; 


misc. angle  =  theta; 
ray.z  =  source. z,  ray.y  = 
c  (ray. z, &svel) ; 
setup (Ssvel, &misc,  Sgb)  ; 
cnt  =  0; 
bcount  =  0; 
h  =  hstart; 
ttime  =  0.0; 
done  =  false; 
raypath[cnt]  =  ray; 
beampath [cnt ]  =  gb; 

Gamm (-1.0*misc.angle*misc.direction, ssvel ,  Sgb) ; 


while ( ! done) ( 
newz  =  ray . z ; 
newb  =  gb; 

f cn (h, newz, Snewb, Smisc, Sk [ 1 ] ) ;  /*  first  fen  evaluation  */ 
if  (k [1  ]  [ zval ]  <  0 . 0 )  { 
reducestep ( ) ; 
continue; 

} 

newz  =  ray.z  +  misc . direction*k [ 1 ] [ zval ] / 4 . 0 ; 
newb.pl  =  gb.pl  +  k  [  1  ]  [plval ] /'4 . 0 ; 
newb.p2  =  gb.p2  +  k [ 1 ] [p2val ] /4 . 0 ; 
newb.ql  =  gb.ql  +  k [ 1 ] [qlval ) /4 . 0; 
newb.q2  =  gb.q2  +  k[l][q2val]/4.0; 

fen (h, newz, Snewb, Smisc, &k[2] ) ;  /*  second  fen  evaluation  */ 
if  (k [2]  [zval]  <  0 . 0)  { 
reducestep () ; 
continue; 

} 

newz  =  ray. z+misc. direction* 

(3. 0  *k [ 1 ]  [zval] +9. 0*k[2]  [zval] ) /32.0; 
newb.pl  =  gb.pl  +  (3 . 0*k [1] [plval] +9. 0*k [2] [plval] ) /32 . 0; 
newb . p2  =  gb.p2  +  ( 3 . 0*k [ 1 ] [p2val ] +9 . 0*k [2 ] [p2val ] ) /32 . 0 ; 
newb.ql  =  gb.ql  +  (3 . 0*k  [1]  [qlval]  *-9. 0*k  [2]  [qlval  ]  ) /32 . 0; 
newb . q2  =  gb.q2  +  (3.0*k[l][q2val]+9.0*k[2][q2val])/32.0; 
fen (h,  newz,  Snewb, Smisc, &k [ 3] ) ;  /*  third  fen  evaluation  */ 
if  (k [3] [zval]  <  0.0) { 
reducestep ( ) ; 
continue; 

} 

newz  =  ray. z+misc. direction* (1932 . 0*k [1] [zval] - 
7200 . 0*k [2] [ zval ] + 

7296 . 0*k [ 3] [zval] ) /21 97.0; 

newb.pl  =  gb.pl+(1932.0*k[l] [plval ] -7200 . 0*k [2 ] [plval] + 
7296 . 0*k [3] [plval] ) /2197.0; 

newb. p2  =  gb.p2+ (1932. 0*k[l] [p2val] -7200. 0*k [2] [p2val] + 
7296.0*k[3] [p2val] ) /2197. 0; 

newb.ql  =  gb . ql+ ( 1 932 . 0*k [ 1 ] [qlval ] -7200 . 0*k [ 2 ] [qlval ] + 
7296 . 0*k  [3]  [qlval] )/2197.0; 

newb . q2  =  gb . q2+ ( 1932 . 0*k [ 1 ] [q2val ] -7200 . 0*k [2 ] [q2val ] + 
7296. 0*k [3] [q2vnl ) )/2197.0, 

f cn (h, newz, Snewb, Smisc, Sk [ 4 ]) ;  /*  fourth  fen  evaluation  */ 
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if  (k  [  4 ]  [zval]  <  0.0) { 
reducestep ( ) ; 
continue; 

1 

newz  =  ray.z  +  misc . direction* ( 4 39 . 0*k [ 1 ][ zval ] /21 6 . 0  - 
8.0*k[2] [zval]  +  3680 . 0*k [ 3] [ zval ] /51 3 . 0  - 
845 . 0*k [4 ]  [zval] /4104.0) ; 
newb.pl  =  gb.pl  +  (439.0*k[l] [plval ] /21 6 . 0  - 

8 . 0*k [2]  [plval]  +  3680. 0*k[3]  [plval] /513.0  - 
845 . 0*k [4 ] [plval] / 41 04.0) ; 
newb . p2  =  gb ,p2  +  (439.0*k[l][p2val]/216.0  - 

8 . 0*k [2] [p2val]  +  368 0 . 0*k [ 3 ] [p2val ] /51 3 . 0  - 
845 . 0*k [4] [p2val] /4 104.0) ; 
newb . ql  =  gb . ql  +  (439. 0*k[l] [qlval]/2l6.0  - 

8 . 0*k[21 [qlval]  +  3680 . 0*k [ 3] [ql val ] /51 3 . 0  - 
845 . 0*k [4 ] [qlval] /41 04.0) ; 
newb . q2  =  gb.q2  +  ( 439 . 0*k [ 1 ]  [q2val ] /21 6 . 0  - 

8. 0*k[2] [q2val]  +  3680 . 0*k [ 3] [q2val ] /51 3 . 0  - 
845 . 0*k [4 ] [q2val] /4104.0) ; 

fen (h, newz, Snewb, &misc, &k[5] )  ;  /*  fifth  fen  evaluation  */ 
if  (k [5] [zval]  <  0.0) { 
reducestep ( ) ; 
continue; 

) 

newz  =  ray.z  +  misc. direction* (-8 . 0*k [1]  [zval] /27 . 0  + 

2 . 0*k[2] [zval]  -  3544 . 0*k [ 3] [ zval ] /2565 . 0  + 

1859. 0*k[4]  [zval] /4104.0  -  11 . 0*k  [5]  [zval ] /4 0 . 0) ; 
newb.pl  =  gb.pl  +  (-8 . 0*k [1 ] [plval] /27 . 0  + 

2.0*k[2] [plval]  -  3544. 0*k[3] [plval] /2565.0  + 

1859. 0*k[4] [plval] /4104.0  -  11 . 0*k [5] [plval] /40 . 0) 
newb.p2  =  gb.p2  +  (-8 . 0*k[l] [p2val] /27 . 0  + 

2 . 0*k [2] [p2val]  -  354 4 . 0*k [ 3] [p2val ] /2565 . 0  + 
1859.0*k[4]  [p2val] /4104 . 0  -  11 . 0*k l 5]  [p2val] /40 . 0) 
newb.ql  =  gb.ql  +  (-8 . 0*k [1 ] [qlval ] /27 . 0  + 

2.0*k[2] [qlval]  -  3544 . 0*k[3] [qlval] /2565. 0  + 
1859.0*k[4] [qlval] /4104 . 0  -  11 . 0*1 r5] [qlval] /40 . 0) 
newb.q2  =  gb.q2  +  (-8 . 0*k [ 1 ] [q2val] /27 . 0  + 

2 . 0*k [2] [q2val]  -  3544 . 0*k [3] [q2val] /2565. 0  + 
1859.0*k[4] [q2val] /4104 . 0  -  11 . 0*k [ 5 ] [q2val ] /4 0 . 0 ) 
f cn (h,  newz,  Snewb, &misc,  &  k [6] ) ;  /*  final  fen  evaluation  */ 
if  (k [6] [zval]  <  0.0) { 
reducestep  { ) ; 
continue; 

} 

/*  compute  error  estimate  */ 

err  =  f abs (k [ 1 ] [zval ] /360 . 0  -  128 . 0*k [3] [zval] /4275 . 0  - 
2197. 0*k[4] [zval] /75240.0  + 
k[5]  [zval] /50.0  +  2 . 0*k[6] [zval] /55. 0) ; 

if  (err/h  <  work . tolerance) {  /*  error  is  acceptable  */ 

/*  compute  new  values  of  z,  p  and  q  */ 
zstep  =  misc . direction* (25 . 0*k [ 1 ][ zval ] /216 . 0  + 

1408 . 0*k[3] [zval] /2565. 0  + 

2l97.0*k[4] [zval] /4104.0  -  k [ 5] [ zval ] /5 . 0 ) ; 
gb.pl  +=  (25. 0*k[l] [plval] /216.0  + 
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1408.0*k[3] [plval] /2565.0  + 

2197.0*k[4] [plval]/4104.0  -  k [ 5 ] [plval ] / 5 . 0 ) ; 
gb.p2  +=  (25.0*k[l] [p2val] /216.0  + 

1408.0*k[3] [p2val] /2565.0  + 

2197.0*k[4] [p2val ] /4 1 04 . 0  -  k [ 5 ] [p2val ] / 5 . 0 ) ; 
gb.ql  +=  <25 . 0*k [ 1 ] [qlval] /216.0  + 

1408.0*k[3] [qlval ] /2565 . 0  + 

2197. 0  *  k  [  4 ]  [qlval] /4104.0  -  k [5]  [qlval] /5 . 0) ; 
gb.q2  +=  (25.0*k[l] [q2val] /216 . 0  + 

1408 . 0*k [3] [q2val j /2565 . 0  + 

2197.0*k[4] [q2val] /4104.0  -  k [5] [q2val] /5 . 0) ; 


ray . z  +=  zstep; 
ray.x  +=  h; 

if  (ray . z  <  0.0) {  /*  check  if  we  surfaced  */ 

turnflag  =  surf_refl; 
ray.z  -=  zstep;  /*  step  back  */ 


ray.x  -=  h; 


reducestep ( ) ;  /*  reduce  step  size  */ 

continue; 


) 

bottomval (ray .x, &bot) ;  /*  check  if  we  bottomed  out  */ 
if  ((ray.z  >  bot.z)  &&  (fabs(ray.z  -  bot.z)  > 
work.hottom_tolerance) ) { 
turnflag  =  bott_refl; 
ray.z  -=  zstep;  /*  step  back  */ 
ray.x  -=  h; 

reducestep () ;  /*  reduce  step  size  */ 

if  (misc. angle  >  PI/2.0) 
done  =  true; 

‘ continue; 

1 

ttime  +=  h/ (svel . c*cos (misc . angle) )  ; 

if  (ray.x  >=  receiver [ 0 ][ 0 ]. x)  /*  past  receiver  range  */ 
done  =  true; 
increment ( ) ; 

Gamm(-1 . 0*misc . angle*misc . direction, &svel,  &gb) ; 
raypath[cnt]  =  ray; 
beampath [cnt]  =  gb; 
sqrtBranch ( ) ; 
c (ray. z, Ssvel) ; 

misc. angle  =  acos (svel . c*misc . a) ; 


/*  scale  step  size  */ 

scale  =  0 . 84*pow ( (work . tolerance*h/err )  ,  0 . 25)  ; 
if  (scale  <  work . scaleMin)  scale  =  work . scaleMin; 
if  (scale  >  work . scaleMax)  scale  =  work . scaleMax; 
h  =  h*scale; 

if  (h  <  1.0) {  /*  stap  is  kind  of  smal_  !?!*/ 

reducestep ( ) ; 
continue; 

) 

if  ((ray.x  +  h)  >  receiver [ 0 ][ 0 ]. x)  /*  don't  step  past  */ 

/*  receiver  */ 

h  =  receiver [0] [0] ,x  -  ray.x; 
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) 

f inray . rayend  =  ray;  /*  return  final  ray  position  */ 
f inray. angle  =  misc. angle; 
f inray. time  =  ttime; 

} 

/********************************************,**********,,****,*****»*,» 

*  Reduces  integration  step  size 
*/ 

reducestep ( ) 

{ 

work . scaleMax  =  1.0; 
h  =  h/2.0; 
if  <h  <  hstart) { 

work . scaleMax  =  scalesave;  /*  restore  scalemax  to  */ 

/*  original  value  */ 

turnpoint  (& ray , &misc,  igb)  ;  /*  go  to  turning  point  appro;-:.  */ 
h  =  hstart; 

) 

} 

*  Increments  array  index  used  for  storing  ray  path.  Also 

*  checks  if  we  have  too  many  points 
*/ 

increment ( ) 

{ 

cnt++ ; 

if  (cnt  >=  maxraypoints) 
done  =  true; 

} 

/**********************************************»****,****«***,**,»»*,,», 

*  Checks  for  p(s)  imag.  axis  crossings. 

* 

sqrtBranch ( ) 

( 

double  ql,q2; 

if  (cnt  ==  0 ) 

branch [0]  =  0; 
else  { 

ql  =  beampath [cnt-1 ] . q2 ,  q2  =  beampath [ cnt ] . q2 ; 
if ( ( (ql  <  0 . 0 ) &  & (q2  >  0.0))  ||  ( (ql  >  0.0)&&(q2  <  0.0)))  ( 

branch [bcount ]  =  cnt;  /*  store  index  where  crossing  */ 

/*  occurred  */ 

bcount i f; 

} 

) 

} 

/*********************************************************************** 

★ 

*  Complex  math  functions 

*/ 

complex  comp(x,y)  /*  assign  reals  to  type  complex  */ 

double  x,y; 

( 

complex  z ; 
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z  .  re  =  x  ; 
z  .  i  m  =  y  ; 
return  (  z ) ; 

} 

complex  cadd(x,y)  /*  complex  add  */ 

complex  x,y; 

{ 

complex  z; 

z  .  re  =  x  .  re  +  y  .  re ; 
z  .  i  m  -  x  .  i  mi  +  y  .  i  m  ; 
return  ( z) ; 

} 

complex  cmult(x,y)  /*  complex  multiply  */ 

com.pl  ex  x,y; 

{ 

complex  z; 

z  .  re  =  x.re*y. re  -  x.im*y.im; 
z . im  =  x.re*y.im  +  x.im'y.re; 
return  (  z ) ; 

) 

complex  cmultd(x,y)  /*  complex  times  a  constant  */ 

double  x; 

complex  y; 

I 

complex  z; 

z . re  -  y . re*x; 
z . im  =  y . im*x; 
return ( z ) ; 

} 

complex  cdiv(x,y)  /*  complex  division  */ 

complex  x,y; 

( 

complex  z ; 
double  den; 

if  (y.re  ==  0.0  &&  y.im  ==  0.0) {  /*  divide  by  zero  */ 

z.re  =  HUGE_VAL,  z . im  =  HUGE_VAL; 
return (z) ; 

) 

else  { 

den  =  y.re*y.re  +  v.im*v.im; 
z.re  =  (x.re*y.re  +  x . im*y . im) /den; 
z . im  =  (y.re*x.im  -  x . re*y . im) /den ; 
return ( z ) ; 

} 
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double  real  (x)  /*  return  real  part  */ 

complex  x; 

( 

return (x . re) ; 

1 

double  imag(x)  /*  return  complex  part  */ 

complex  x; 

{ 

return (x . im) ; 

} 

/  *  i »*t***'(nk******************^*****************************:***********>*’* 

*  Computes  part  of  Gauss,  beam  equation  at  same  time 

*  as  ray  path  is  computed. 

*  / 

Gamm (theta , svel , beam) 
double  theta; 

sound_speed  ‘svel; 
gauss_beam  ‘beam; 

( 

double  c2; 

double  cs,cn,tr,tz; 

complex  p, q, res , comp ( ) , cdiv ( ) , cmultd ( ) ; 

t  r  =  cos (theta ) ; 
tz  =  sin (theta)  ; 
c2  =  svel->c*svel->c; 
cs  =  svel->g*tz; 
cn  =  -1 . 0*svel->g*tr; 

p  =  comp (beam->pl, beam~>p2*epsilon)  ; 
q  =  comp (beam->ql , beam->q2*epsilon)  ; 
res  =  cdiv(p,q); 

gamma [ cnt ]. re  =  0 . 5*res . re*tr*tr  +  2 . 0*cn*tz*tr/o2  -  cs*tz*tz/c2; 
gamma [ cnt ]. im  =  0 . 5*res . im*tr*tr; 

[ 

/************************************************»***»****»»****»»*•*»»» 

*  Compute  Gauss,  beam  summation  at  particular  receiver  location 

*  / 

Gauss Summ ( ) 

f 

register  i; 

double  temp; 

complex  p, q, comp (), cdiv () ; 

for  (i  =0;  i  <  cnt;  i++) { 

beampath [ i ] . p2  =  beampath[iJ.p2*epsilon; 
beampath [ i ] . q2  =  beampath [ i ]. q2‘epsi Ion ; 

} 

I 

/**************************************************> r******************** 

*  Route  messages  back  to  Mac 

*/ 

int  feedback (procdesc, fromlocal,  fromnext,  toprev) 
int  procdesc; 

Channel  ‘fromlocal,  ‘fromnext,  ‘toprev; 
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char  *buff; 

int  msg, idx, len; 

int  procID, atime; 

Channel  *chan; 

static  int  toggle  =  0; 

ChanOutlnt (toprev, MSG_BOOTED) ;  /*  tell  host  we're  alive  */ 
while  (TRUE)  { 


/*  NOTE:  the  code  below  alternates  between  two  calls  to  */ 
/*  the  same  function  ProcAlt.  The  ProcAlt  function  tends  */ 
/*  to  favor  the  channel  given  first  in  the  argument  list  */ 
/*  if  both  are  ready  at  the  same  time.  The  two  calls  simply  */ 
/*  alternates  the  order  of  the  arguments  for  a  more  */ 


/*  equitable  approach  when  both  channels  are  ready  for  input  */ 

switch (toggle) { 
case  0: 
do  { 

idx  =  ProcAlt (fromlocal, fromnext,  0) ; 

)  while  (idx==-l); 
switch(idx)  ( 

case  0:  chan  =  fromlocal;  break; 

case  1:  chan  =  fromnext;  break; 

) 

toggle  =  Itoggle; 
break; 
case  1: 
do  { 

idx  =  ProcAlt (fromnext, fromlocal, 0) ; 

)  while  (idx==-l); 
switch (idx)  ( 

case  0:  chan  =  fromnext;  break; 

case  1:  chan  =  fromlocal;  break, 

} 

toggle  =  Itoggle; 
break; 

} 

msg  =  Chanlnlnt (chan) ; 
switch (msg)  ( 

case  MSG_BOOTED: 

ChanOutlnt (toprev, MSG_BOOTED) ; 
break; 

case  MSG_TEST: 

procID  =  Chanlnlnt (chan)  ; 
len  =  Chanlnlnt (chan )  ; 

ChanOutlnt (toprev,  MSG_TEST)  ; 

ChanOutlnt (toprev, procID)  ; 

ChanOutlnt (toprev,  len)  ; 
break; 

case  MSG_WORK_DONE : 

procID  =  Chanlnlnt (chan)  ; 
len  =  Chanlnlnt (chan); 
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buff  =  (char  * ) malloc {2 en) ; 

Chanln  (char.,  buff,  len)  ; 

ChanOutlnt (top rev, MSG_WORK_DONE) ; 

ChanOutlnt (toprev, procID) ; 

ChanOutlnt (toprev, len) ; 

ChanOut (toprev, buff , len) ; 

free (buff) ; 

break; 

case  MSG_RAY_DATA : 

procID  =  Chanlnlnt (chan)  ; 
len  =  Chanlnlnt (chan)  ; 
buff  =  (char  Mmalloc  (len)  ; 

Chanln (chan, buff, len) ; 

ChanOutlnt (toprev, MSG_RAY_DATA) ; 

ChanOutlnt (toprev, procID) ; 

ChanOutlnt (toprev, len) ; 

ChanOut (toprev, buff, len) ; 
free (buff) ; 
break ; 

) 

) 

} 

/*************************************************************»*****»*** 

*  This  is  the  MAIN  part  of  the  program  (MAIN  in  the  C  context) . 

*  This  is  where  all  the  process  allocation  and  prioritization  and 

*  channel  allocation  is  handled. 

*/ 

/*  Declare  channels  */ 

Channel  *tolocal,  *fromlocal;  /*  non-link  channels  */ 

Channel  *toInBuf ferO, *toInBuf f erl , *toInBuf fer2 ; 

Channel  *f romOutBuf ferO, *f romOutBuf ferl , *f romOutBuf f er2 ; 

Channel  *totimer,  *fromtimer; 

/*  Declare  processes  */ 

Process  *pthroughput,  *prender,  *pfeedback; 

Process  *pInBuf ferO, *pInBuf ferl, *pInBuf fer2; 

Process  *pOutBuf ferO, *pOutBuf f erl , *pOutBuf f er2 ; 

Process  *pstopwatch; 

extern  char  *_heapend;  /*  points  to  end  of  heap  space  */ 

extern  char  *_heapstart;  /*  points  to  start  of  heap  space  */ 

ma i n  ( )  { 

~har  chipRAM[CHIPRAMSIZE] ;  /*  allocate  memory  from  on-chip  RAM  */ 

_heapend  =  0x80101000;  /*  1  MByte  Module  */ 

/*  allocate  the  internal  channels  */ 
tolocal  =  ChanAllocO; 

fromlocal  =  ChanAllocO; 

toInBufferO  =  ChanAllocO; 

toInBufferl  =  ChanAllocO; 
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toInBuffer2  =  ChanAllocO; 
f  romOutBuf  ferO  =  ChanAllocO; 
f  romOutBuf  ferl  =  ChanAllocO; 
f  romOutBuf  fer2  =  ChanAllocO; 
totimer  =  ChanAllocO; 

fromtimer  =  ChanAllocO; 

prender  =  (Process  *)malloc (sizeof (Process) ) ; 

ProcI nit (prender, render, chipRAM, CHIPRAMSIZE, 4 , tolocal , f romlocal , to 
timer, fromtimer) ; 

/*  declare  throughput  process  */ 

pthroughput  =  ProcAlloc (throughput, STACKSIZE, 3, 

LINKOIN, LINKIOUT, tOlnBuf fer2) ; 

/*  declare  input  buffers  */ 

pInBuffer2  =  ProcAlloc (buf fer, 100, 2, toInBuf fer2,  toInBuf ferl )  ; 
plnBufferl  =  ProcAlloc (buf fer , 1 00 , 2 , toInBuf ferl , toInBuf ferO ) ; 
plnBufferO  =  ProcAlloc (buffer, 100, 2, toInBuf ferO,  tolocal)  ; 

/*  declare  output  buffers  */ 

pOutBufferO  =  ProcAlloc (buffer, 100, 2, fromlocal, f romOutBuf ferO )  ; 
pOutBufferl  =  p'rocAlloc  (buf  fer,  100, 2,  f  romOutBuf  ferO, 

f romOutBuf ferl ) ; 

pOutBuffer2  =  ProcAlloc (buf fer, 100, 2, f romOutBuf ferl, 

f romOutBuf fer2) ; 

/*  declare  feedback  process  */ 

pfeedback  =  ProcAlloc (feedback, STACKSIZE, 3, f romOutBuf fer2 , 

LINK1IN, LINKOOUT) ; 

/*  declare  stopwatch  process  */ 

pstopwatch  =  ProcAlloc (stopwatch, 1000, 2, totimer, fromtimer) ; 

/*  launch  the  processes  in  parallel  */ 

ProcRun (prender) ;  /*  low  priority  */ 

ProcToHigh ( ) ;  /*  switch  to  high  priority  */ 

ProcPar (pthroughput , pfeedback, pstopwatch, plnBuf f er 0 , 
plnBufferl , plnBuf f er2, pOutBufferO , 
pOutBufferl, pOutBuf fer 2, 0) ; 

) 
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D.  INCLUDE  FILES 

The  following  are  two  of  the  include  (.h)  files  used  for  the  ray  tracing 
algorithm.  The  first,  messagelD.h,  gives  the  message  identification  number 
definitions.  The  second,  raydefs.h,  gives  the  definitions  for  some  of  the 
constants  and  data  structures  used  throughout  the  ray  tracing  and  associated 
algorithms. 


1.  messagelD.h 


/* 

★ 

*/ 


Message  ID's  and  definitions 


/*  Messages  from  Mac  to  transputer  */ 


#define 

MSG 

_PASS 

0 

/* 

pass 

data  to  next  processor 

*/ 

#def ine 

MSG~ 

_PROCID 

1 

/* 

assign  processor  ID  number 

*/ 

♦define 

msg" 

_SSPROFILE 

2 

/* 

send 

sound  velocity  profile 

data 

*/ 

♦define 

msg" 

BPROFILE 

3 

/* 

send 

sound  velocity  profile 

data 

*/ 

♦define 

msg" 

WORK 

4 

/* 

send 

work  parameters 

*/ 

♦define 

msg" 

_BEAM 

5 

/* 

send 

beam  parameters 

*/ 

♦define 

msg" 

ANGLE 

6 

/* 

send 

work  packet 

*/ 

♦define 

msg" 

'source 

7 

/* 

send 

source  position 

*/ 

♦define 

msg" 

RECEIVER 

8 

/* 

send 

receiver  position 

*/ 

♦define 

msg" 

TEST 

9 

/* 

used 

for  testing 

*/ 

♦define 

msg" 

SCREEN 

10 

/* 

used 

send  Mac  screen  parameters 

*/ 

/*  Messages  from  transputer 

to  Mac  */ 

♦define 

MSG 

RAYJDATA 

21 

/* 

return  ray  data 

*/ 

♦define 

msg] 

_BEAM_DATA 

22 

/* 

return  beam  data 

*/ 

♦define 

msg" 

_TIME_DATA 

23 

/* 

return  timing  info 

*/ 

♦define 

msg" 

_ERROR 

999 

/* 

not  implemented 

*/ 

♦define 

msg" 

_BOOTED 

555 

/* 

tell 

host  we  booted  OK 

*/ 

2. 

raydefs.h 

♦define 

maxpoints  50 

/*  max.  number 

of  points  for 

SS  profile 

♦define 

maxbottompoints  100  /*  max.  number 

of  points  for 

bottom 

*  / 

♦define 

maxraypoints  2000 

/*  max.  number  of 

points  for  ray 

path. 

etc 

♦define 

nil  0L 

♦define 

up  -1.0 

♦define 

down  1 . 0 

♦define 

surf  refl  1  /* 

surface  reflection 

i  */ 

♦define 

bott  refl  2  /* 

bottom  reflection 

*/ 
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3  /*  turning  point 


♦define  turn_pt 


♦define 

♦define 

♦define 

♦define 

♦define 


zval  0 
plval  1 
p2val  2 
qlval  3 
q2val  4 


*/ 


/*  beam  parameter  definitions  */ 


♦define  incoherent  0 
♦define  coherent  1 
♦define  semicoherent  2 
♦define  fillbeams  0 
♦define  cervenybeams  1 


♦define  minwidthbeams  2 

♦define  PI  3.14159265 

typedef  struct! 

double  angle; 
double  a; 
double  direction; 
}  extra  stuff; 


typedef  struct { 

/* 

sound  speed  profile  data  structure  */ 

int 

npts; 

/* 

number  of  tabulated  points  */ 

double 

range; 

/* 

range  (for  range  dependency)  */ 

double 

a [maxpoints] ; 

/* 

spline  coefficients  */ 

double 

z [maxpoints] ; 

/* 

depth  values  */ 

double 

c [maxpoints] ; 

/* 

speed  of  sound  values  */ 

}  sound_jprof ile; 

typedef  struct { 

/* 

bottom  bathymetry  data  structure  */ 

int 

npts; 

/* 

number  of  tabulated  points  */ 

double 

a [maxbottompoints] 

;  /*  spline  coefficients  */ 

double  z [maxbottompoints] ;  /*  depth  values  */ 
double  r [maxbottompoints] ;  /*  range  values  */ 


(  bottom_prof ile; 


typedef  struct! 

/* 

bottom  point  data  structure  */ 

double 

z; 

/* 

depth  of  bottom  */ 

double 

g; 

/* 

gradient  of  bottom  */ 

}  bottom_point; 

typedef  struct! 

/* 

sound  speed  data  structure  */ 

double 

c; 

/* 

sound  speed  */ 

double 

g; 

/* 

sound  speed  gradient  */ 

double 

gg; 

/* 

second  derivative  of  sound  speed  */ 

}  sound_speed; 

typedef  struct! 

/* 

position  data  structure  */ 

double 

x; 

/* 

range  value  */ 

double 

y; 

/* 

azimuthal  value  (not  used)  */ 

double 

z ; 

/* 

depth  value  */ 

}  position; 
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typedef  struct} 
double 
double 
double 

x; 

y; 

z; 

/* 

source  data  structure  */ 

double 

f  requency ; 

/* 

source  frequency  */ 

}  source_posn; 

typedef  struct} 

/* 

work,  parameters  data  structure 

double 

theta  upper; 

/* 

upper  ray  angle  */ 

double 

theta_lower; 

/* 

lower  ray  angle  */ 

double 

angle  incr; 

/* 

angle  increment  */ 

int 

numRays; 

/* 

number  of  rays  to  trace  */ 

double 

scaleMax; 

/* 

max  step  size  scale  value  */ 

double 

scaleMin; 

/* 

min  step  size  scale  value  */ 

double 

tolerance; 

/* 

integration  error  tolerance  */ 

double 

bottom  tolerance; 

/* 

bottom  tolerance  value  */ 

}  work_params; 

typedef  struct { 

/*  Gauss 

beam 

parameters  data  st;  .cture  */ 

int 

radii;  /*  max. 

beam  radii  for  windowing  */ 

int 

summ;  /*  beam 

summation  method  */ 

int 

IC;  /*  beam 

initial  condition  type  */ 

}  beam_params; 

typedef  struct} 

/* 

ray  results  data  structure  */ 

position 

rayend; 

/* 

final  position  of  ray  path  */ 

double 

angle; 

/* 

final  angle  */ 

double 

time; 

/* 

travel  time  */ 

int 

surface_hits; 

/* 

number  of  surface  reflections 

int 

bottom  hits; 

/* 

number  of  bottom  reflections  * 

}  ray  result; 

typedef  struct} 

/* 

position 

ray; 

/* 

double 

angle; 

/* 

double 

time; 

/* 

}  rayjpath; 

typedef  struct} 

/* 

double 

pi; 

/* 

double 

P2; 

/* 

double 

ql; 

/* 

double 

q2; 

/* 

}  gauss_beam; 

typedef  struct} 

/* 

double 

re; 

/* 

double 

im; 

/* 

)  complex; 

typedef  struct} 

/* 

double 

r  ; 

/* 

double 

theta; 

/* 

} complexExp; 

ray  path  data  structure  */ 
ray  position  */ 
ray  angle  */ 
time  of  travel  */ 


Gauss,  beam  data  structure  */ 

real (p)  */ 

imag(p)  */ 

real(p)  */ 

imag(q)  */ 


complex  number  data  structure  * / 
real  part  */ 
imaginary  part  */ 

exponential  form  of  complex  #  */ 
magnitude  */ 
phase  */ 
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typedef  struct 

{ 

/* 

screen  coordinate  data  structure 

*  / 

int 

h; 

/* 

horizontal  position  */ 

int 

}  screen_pos; 

v; 

/* 

vertical  position  */ 

typedef  struct 

{ 

/* 

screen  parameters  data  structure 

*  / 

int 

top; 

/* 

top  of  drawing  area  */ 

int 

left; 

/* 

left  side  of  drawing  area  */ 

int 

RayPixelsV; 

/* 

number  of  vertical  pixels  */ 

int 

RayPixelsH; 

/* 

number  of  horizontal  pixels  */ 

int 

RayScaleV; 

/* 

vertical  scale  */ 

int 

RayScaleH; 

/* 

horizontal  scale  */ 

}  screen_data; 

typedef  struct 

{ 

/* 

timing  information  data  structure  */ 

int 

tl; 

/* 

ray  path  calculation  time  */ 

int 

t2  ; 

/* 

screen  coord,  computation  time  */ 

int 

t3; 

/* 

send  data  to  buffer  process  time  */ 

}  tinfo; 
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APPENDIX  B 


MACINTOSH  APPLICATION  STRUCTURE 


This  appendix  gives  the  basic  structure  of  a  typical  Macintosh  application 
main  event  loop.  For  clarity,  this  example  is  written  using  mostly  English 
statements  rather  than  C  code. 

main  ( ) 

{ 

Initialize  necessary  toolbox  routines 
do  { 

perform  a  system  task 

check  if  any  data  ready  from  transputers  (used  for  this  project 

specifically) 

check  to  see  if  an  'event'  has  occurred 

if  so,  handle  it  depending  on  the  type  of  event 
switch  (event  type) 

{ 

case  mouse  button  was  pressed 
find  out  where  it  was  pressed 
handle  it  depending  on  where  it  was  pressed 
switch  (where  pressed) 

{ 

case  in  a  Desk  Accessory 
let  DA  handle  it 
break; 

case  in  the  menu  bar 

handle  the  command  selected 
break; 

case  in  the  drag  region  of  a  window 
drag  the  window 
break; 

case  in  the  content  region  of  a  window 

do  appropriate  action  for  content  selected 
break; 

case  in  the  window' s  close  box 

track  the  mouse  and  close  window  if  necessary 
break; 

case  in  the  grow  region  of  the  window 
change  size  of  window  appropriately 
break; 

case  in  zoom  (in)  box  of  window 

track  mouse  and  zoom  window  if  necessary 
break; 
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case  in  zoom  (out)  box  of  window 

track  mouse  and  zoom  window  if  necessary 
break; 

} 

break; 

case  a  key  was  pressed 

perform  appropriate  action  for  key(s)  pressed 
break; 

case  a  window  has  been  activated 

activate  new  window  and  deactivate  old  window 
break; 

case  a  window  requires  updating 
update  contents  of  window 
break; 

} 

}  while  (!doneFlag);  /*  while  user  has  not  selected  quit  */ 
return ( 0 ) ; 
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